This Jupyter Notebook provides a workflow for processing single-cell RNA-seq (scRNA-seq) data. Single-cell sequencing analysis is a relatively new and rapidly changing area. Many tools and methods have been developed for single-cell data, and scRNA-seq analysis covers many interesting topics. In this Notebook, we will mainly focus on the following selected topics:
This notebook uses the SoS kernel, so we can mix R and bash commands and switch between them with %use. I will set the working directory in both R and bash again:
%use r
mydir<- getwd()
setwd(mydir)
%use bash
mydir=`pwd`
cd $mydir
Before we can run any analysis, we need to load the necessary R packages with the library() function.
The following code may print multiple messages while loading the packages, such as "The following objects are masked from XXX". This is normal and you can ignore them.
%use r
library(DESeq2)
library(dplyr)
library(edgeR)
library(Seurat)
library(cowplot)
library(MetaDE)
library(patchwork)
Loading required package: S4Vectors
Loading required package: stats4
Loading required package: BiocGenerics
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
Filter, Find, Map, Position, Reduce, anyDuplicated, append,
as.data.frame, basename, cbind, colnames, dirname, do.call,
duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
tapply, union, unique, unsplit, which.max, which.min
Attaching package: ‘S4Vectors’
The following objects are masked from ‘package:base’:
I, expand.grid, unname
Loading required package: IRanges
Loading required package: GenomicRanges
Loading required package: GenomeInfoDb
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats
Attaching package: ‘MatrixGenerics’
The following objects are masked from ‘package:matrixStats’:
colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
colWeightedMeans, colWeightedMedians, colWeightedSds,
colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
rowWeightedSds, rowWeightedVars
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Attaching package: ‘Biobase’
The following object is masked from ‘package:MatrixGenerics’:
rowMedians
The following objects are masked from ‘package:matrixStats’:
anyMissing, rowMedians
Attaching package: ‘dplyr’
The following object is masked from ‘package:Biobase’:
combine
The following object is masked from ‘package:matrixStats’:
count
The following objects are masked from ‘package:GenomicRanges’:
intersect, setdiff, union
The following object is masked from ‘package:GenomeInfoDb’:
intersect
The following objects are masked from ‘package:IRanges’:
collapse, desc, intersect, setdiff, slice, union
The following objects are masked from ‘package:S4Vectors’:
first, intersect, rename, setdiff, setequal, union
The following objects are masked from ‘package:BiocGenerics’:
combine, intersect, setdiff, union
The following objects are masked from ‘package:stats’:
filter, lag
The following objects are masked from ‘package:base’:
intersect, setdiff, setequal, union
Loading required package: limma
Attaching package: ‘limma’
The following object is masked from ‘package:DESeq2’:
plotMA
The following object is masked from ‘package:BiocGenerics’:
plotMA
The legacy packages maptools, rgdal, and rgeos, underpinning this package
will retire shortly. Please refer to R-spatial evolution reports on
https://r-spatial.org/r/2023/05/15/evolution4.html for details.
This package is now running under evolution status 0
Attaching SeuratObject
Attaching package: ‘Seurat’
The following object is masked from ‘package:SummarizedExperiment’:
Assays
Loading required package: survival
Loading required package: impute
Loading required package: combinat
Attaching package: ‘combinat’
The following object is masked from ‘package:utils’:
combn
Loading required package: tools
Attaching package: ‘patchwork’
The following object is masked from ‘package:cowplot’:
align_plots
The data used for this notebook is single-cell RNA-seq data from human prostate carcinoma-associated fibroblasts. Carcinoma-associated fibroblasts (CAF) are a heterogeneous group of cells within the tumor microenvironment (TME) that can promote tumorigenesis in the prostate. Please refer to this paper if you would like to know more about this data.
Cell Ranger is developed and maintained by 10x Genomics. It provides a set of pipelines to process and analyze raw scRNA-seq data. Below, we provide some example code for using Cell Ranger; more details can be found on the 10x Genomics website here.
If you would like to try the examples yourself, you will need to download the data and modify the output paths according to your own directory.
The FASTQ data files are already prepared and saved in "/anvil/projects/x-tra220018/2022/datasets/single_cellData/Ratliff_CAF/". You do not have permission to make changes in this directory, but you can try the code by changing the output path to your own directory. Below is the directory structure:
030386_Control-CAF_S1_run656_L001_R1_001.fastq.gz
030386_Control-CAF_S1_run656_L001_R2_001.fastq.gz
030386_Control-CAF_S1_run656_L002_R1_001.fastq.gz
030386_Control-CAF_S1_run656_L002_R2_001.fastq.gz
030386_Control-CAF_S1_run656_L003_R1_001.fastq.gz
030386_Control-CAF_S1_run656_L003_R2_001.fastq.gz
030386_Control-CAF_S1_run656_L004_R1_001.fastq.gz
030386_Control-CAF_S1_run656_L004_R2_001.fastq.gz
030386_Control-CAF_S1_run659_L001_R1_001.fastq.gz
030386_Control-CAF_S1_run659_L001_R2_001.fastq.gz
030386_Control-CAF_S1_run659_L002_R1_001.fastq.gz
030386_Control-CAF_S1_run659_L002_R2_001.fastq.gz
030386_Control-CAF_S1_run659_L003_R1_001.fastq.gz
030386_Control-CAF_S1_run659_L003_R2_001.fastq.gz
030386_Control-CAF_S1_run659_L004_R1_001.fastq.gz
030386_Control-CAF_S1_run659_L004_R2_001.fastq.gz
Please note that Cell Ranger requires the input FASTQ files to follow the naming convention of bcl2fastq or mkfastq, e.g. Sample_S1_L00X_R1_001.fastq.gz. Briefly, the FASTQ files taken by cellranger count are named with the sample name and number, the flow cell lane, and the read. The file extension is '*.fastq.gz'. An example FASTQ file name looks like this: samplename_S1_L001_R1_001.fastq.gz.
Here is the explanation for each element in the name:
Special Note: L001 and L002 are indices of different Illumina sequencing lanes or batches, and we can use these indices, together with the sample indices, to distinguish treatment groups. If we want to analyze all samples in one treatment group together, they are assigned the same sample number (e.g. S1) and different lane numbers (e.g. L001 and L002). (NOTE: Reads cannot be assigned sample number 0 or lane number 0. Reads with number 0 will be excluded from downstream analysis.) For example, if there are 2 treatment groups, each with 3 replicates, we will index the three replicates in group 1 as S1_L001, S1_L002 and S1_L003, and the replicates in group 2 as S2_L001, S2_L002 and S2_L003.
Please refer to Illumina or bcl2fastq User Guide for more details.
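The naming convention described above can be checked before running cellranger count. The following is a minimal sketch, not part of Cell Ranger: the helper name parse_fastq_name is hypothetical, and the pattern (including the assumption that the read field is one of R1, R2, I1 or I2) is our reading of the bcl2fastq convention rather than an official validator.

```bash
# Hypothetical helper: split a bcl2fastq-style FASTQ file name into its parts.
# Prints "sample sample_number lane read", or returns non-zero if the name
# does not follow [sample]_S[number]_L[lane]_[read]_001.fastq.gz.
parse_fastq_name() {
    # Strip any directory, then capture the four fields with an extended regex.
    parsed=$(basename "$1" | sed -nE 's/^(.+)_S([0-9]+)_L([0-9]{3})_([RI][12])_001\.fastq\.gz$/\1 \2 \3 \4/p')
    # An empty result means the name did not match the pattern.
    [ -n "$parsed" ] && echo "$parsed"
}

# Usage: report which file names match the expected pattern.
for f in samplename_S1_L001_R1_001.fastq.gz samplename_L001_R1.fastq.gz; do
    if fields=$(parse_fastq_name "$f"); then
        echo "OK   $f -> $fields"
    else
        echo "SKIP $f (does not match the expected pattern)"
    fi
done
```

Running a check like this on a FASTQ folder can catch misnamed files before a long job is submitted.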
The following code downloads the reference genome from the 10x Genomics website. We used the most recent release, GRCh38.
%use bash
# ref_path=/anvil/projects/x-tra220018/2022/ref_files/cellranger
# wget https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-GRCh38-2020-A.tar.gz -P $ref_path
# tar -zxvf $ref_path/refdata-gex-GRCh38-2020-A.tar.gz -C $ref_path
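Because the reference archive is large, it can be useful to skip the download when the reference is already in place. The sketch below assumes a hypothetical helper, ref_is_ready, that treats any non-empty directory as a usable reference; the default ref_path is a placeholder, and the actual download commands are left commented out as in the cell above.

```bash
# Hypothetical helper: succeed if the reference directory exists and is not empty.
ref_is_ready() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Placeholder location (an assumption); point this at your own directory.
ref_path=${ref_path:-./ref_files/cellranger}
ref_name=refdata-gex-GRCh38-2020-A

if ref_is_ready "$ref_path/$ref_name"; then
    echo "Reference found in $ref_path/$ref_name, skipping download"
else
    echo "Reference not found in $ref_path/$ref_name"
    # Uncomment to download and unpack:
    # mkdir -p "$ref_path"
    # wget https://cf.10xgenomics.com/supp/cell-exp/refdata-gex-GRCh38-2020-A.tar.gz -P "$ref_path"
    # tar -zxvf "$ref_path/$ref_name.tar.gz" -C "$ref_path"
fi
```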
cellranger count
We will use the cellranger count command to generate single-cell feature count data from FASTQ files. The following code uses the FASTQ files listed above to map the RNA-seq reads to the downloaded reference genome GRCh38. Below, we show an example of a script for cellranger count that will be submitted to the bash shell on the server. The code before cellranger count configures the bash script's running environment.
Cell Ranger needs to be submitted to run on a server's backend. We are already on the backend once we have typed startnode, so we don't need to worry about the setup. But in case you will run this program on your institution's server, please read the following part.
To submit Cell Ranger jobs to the backend of a server, you need to set up the running environment in your job script. Many supercomputers use either the SLURM or the PBS submission system. Anvil and other Purdue-based systems use SLURM. You do not need to run or use the code below; these headers are simply shown as examples. If you want to use your own institution's supercomputer after the workshop, you can use these headers as a reference for your own job submission scripts.
An example of a job submission script is shown below. It starts with a shebang line such as #!/bin/sh -l and then specifies the necessary job parameters. You can refer to this page for a SLURM job submission script, and this page for creating a PBS bash script. You will need to modify these parameters according to the computing environment on your institution's server.
#!/bin/sh -l
# SLURM job submission script (the shebang must be the first line of the file)
#SBATCH -p standard
#SBATCH -N 1
#SBATCH -n 40
#SBATCH --time=4:00:00
#!/bin/sh
# PBS job submission script (the shebang must be the first line of the file)
#PBS -q long
#PBS -l nodes=1:ppn=10
#PBS -l walltime=4:00:00
#PBS -M XXX@purdue.edu
# cd $PBS_O_WORKDIR
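To show how the headers and the cellranger call fit together, here is a minimal sketch that writes a complete SLURM job script to a file. The file name cellranger_count.slurm and every /path/to/... value are placeholders (assumptions), and the resource settings are simply copied from the example header above; adjust all of them for your own cluster.

```bash
# Sketch: write a SLURM job script for cellranger count.
# The file name and all /path/to/... values are placeholders (assumptions).
job_script=cellranger_count.slurm

# The quoted 'EOF' keeps the script text literal (no variable expansion here).
cat > "$job_script" <<'EOF'
#!/bin/sh -l
#SBATCH -p standard
#SBATCH -N 1
#SBATCH -n 40
#SBATCH --time=4:00:00

module load biocontainers
module load cellranger/6.1.1

# cellranger writes its results into the current directory
cd /path/to/output_dir || exit 1

cellranger count --id=Control-CAF \
    --sample=030386_Control-CAF \
    --transcriptome=/path/to/refdata-gex-GRCh38-2020-A \
    --fastqs=/path/to/fastq_dir \
    --expect-cells=5000
EOF

echo "Wrote $job_script; submit it with: sbatch $job_script"
```

The script is only written, not submitted, so the sketch can be run on a machine without SLURM; on a SLURM cluster, sbatch is the command that submits it.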
We are already on the backend, so we can run cellranger directly. We will load the pre-installed software using the module load command. (If you run it on other supercomputers, you will need to install cellranger first.) The data processing step is performed by the command beginning with cellranger count.
The cellranger arguments are broken up into multiple lines for easy reading.
Important arguments we used for cellranger count are listed below.
Cellranger doesn't support specifying the output directory in the code. We have to set the output directory and cd to the output folder before calling cellranger, then cd back to the current notebook path once it is done.
(The unset MPLBACKEND is for unsetting an environment variable in Jupyter Notebook that confuses cellranger.)
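One way to handle the "cd in, run, cd back" pattern is to run the command inside a subshell, so the notebook's working directory is restored automatically even if the command fails. This is a sketch of that alternative, not the approach used in the cell below; the echo line is a stand-in for the real cellranger count call so the example runs anywhere.

```bash
# Output folder from this notebook; it is created here if it is missing.
out_path=${out_path:-./data/cellranger}
mkdir -p "$out_path"

start_dir=$(pwd)

# Everything inside ( ... ) runs in a subshell: the cd only affects the subshell.
(
    cd "$out_path" || exit 1
    # The real call would go here, e.g. cellranger count --id=... (placeholder below).
    echo "running in: $(pwd)"
)

# Back in the parent shell, the working directory is unchanged.
echo "back in: $(pwd)"
```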
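%use bash
mydir=$(pwd)
# Unset a Jupyter environment variable that confuses cellranger
unset MPLBACKEND
# Export the setting so that cellranger (a child process) can see it
export MRO_DISK_SPACE_CHECK=disable
module load biocontainers
module load cellranger/6.1.1
ref_path=/anvil/projects/x-tra220018/ref_files/cellranger/
data_path=/anvil/projects/x-tra220018/current/datasets/single_cellData/example
out_path=./data/cellranger
# cellranger writes its output folder (named after --id) into the current directory
cd "$out_path"
cellranger count --id=Control-CAF \
--sample=030386_Control-CAF \
--transcriptome=$ref_path/refdata-gex-GRCh38-2020-A/ \
--fastqs=$data_path \
--expect-cells=5000
# Return to the notebook directory
cd "$mydir"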
User guides for each biocontainer module can be found in https://www.rcac.purdue.edu/knowledge/biocontainers
Martian Runtime - v4.0.6
2025-03-05 01:01:29 [jobmngr] WARNING: configured to use 35GB of local memory, but only 24.0GB is currently available.
Serving UI at http://a204.anvil.rcac.purdue.edu:42285?auth=0U_95By4wTHSoeUKuQAbhVf-kTe9UvHqpKFLw7hMBoM
Running preflight checks (please wait)...
Checking sample info...
Checking FASTQ folder...
Checking reference...
Checking reference_path (/anvil/projects/x-tra220018/ref_files/cellranger/refdata-gex-GRCh38-2020-A) on a204.anvil.rcac.purdue.edu...
Checking optional arguments...
mrc: v4.0.6
mrp: v4.0.6
Anaconda: Python 3.8.2
numpy: 1.19.2
scipy: 1.6.2
pysam: 0.16.0.1
h5py: 3.2.1
pandas: 1.2.4
STAR: 2.7.2a
samtools: samtools 1.10
Using htslib 1.10.2
Copyright (C) 2019 Genome Research Ltd.
2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.SANITIZE_MAP_CALLS
2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.DISABLE_BAMS
2025-03-05 01:01:34 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.DISABLE_BAMS.fork0.chnk0.main
2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.WRITE_GENE_INDEX
2025-03-05 01:01:34 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.WRITE_GENE_INDEX.fork0.chnk0.main
2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.FULL_COUNT_INPUTS.WRITE_GENE_INDEX
2025-03-05 01:01:34 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.FULL_COUNT_INPUTS.WRITE_GENE_INDEX.fork0.chnk0.main
2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MAKE_FULL_CONFIG._MAKE_VDJ_CONFIG
2025-03-05 01:01:34 [runtime] (run:local)
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MAKE_FULL_CONFIG._MAKE_VDJ_CONFIG.fork0.chnk0.main 2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.DETECT_COUNT_CHEMISTRY 2025-03-05 01:01:34 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.DETECT_COUNT_CHEMISTRY.fork0.chnk0.main 2025-03-05 01:01:34 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.DISABLE_BAMS 2025-03-05 01:01:34 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MAKE_FULL_CONFIG._MAKE_VDJ_CONFIG 2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.MULTI_SETUP_CHUNKS 2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.MULTI_SETUP_CHUNKS 2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.MAKE_SHARD 2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.MAKE_SHARD 2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.BARCODE_CORRECTION 2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.BARCODE_CORRECTION 2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.RUST_BRIDGE 2025-03-05 01:01:34 [runtime] (ready) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.RUST_BRIDGE 2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.ASSEMBLE_VDJ 2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.MERGE_METRICS 2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.ASSEMBLE_VDJ 2025-03-05 01:01:34 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.MERGE_METRICS 2025-03-05 01:01:59 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.FULL_COUNT_INPUTS.WRITE_GENE_INDEX 2025-03-05 01:01:59 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR.PARSE_TARGET_FEATURES 2025-03-05 01:01:59 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR.PARSE_TARGET_FEATURES.fork0.chnk0.main 2025-03-05 01:01:59 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.WRITE_GENE_INDEX 2025-03-05 01:02:01 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR.PARSE_TARGET_FEATURES 2025-03-05 01:02:20 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.DETECT_COUNT_CHEMISTRY 2025-03-05 01:02:20 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.CHECK_BARCODES_COMPATIBILITY 2025-03-05 01:02:20 [runtime] (run:local) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.CHECK_BARCODES_COMPATIBILITY.fork0.chnk0.main 2025-03-05 01:02:20 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR._GEM_WELL_CHEMISTRY_DETECTOR.CHECK_BARCODES_COMPATIBILITY 2025-03-05 01:02:20 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR.COMBINE_GEM_WELL_CHEMISTRIES 2025-03-05 01:02:20 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR.COMBINE_GEM_WELL_CHEMISTRIES.fork0.chnk0.main 2025-03-05 01:02:20 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_CHEMISTRY_DETECTOR.COMBINE_GEM_WELL_CHEMISTRIES 2025-03-05 01:02:20 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR.MULTI_SETUP_CHUNKS 2025-03-05 01:02:20 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR.MULTI_SETUP_CHUNKS.fork0.chnk0.main 2025-03-05 01:02:20 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.SPLIT_VDJ_INPUTS 2025-03-05 01:02:21 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR.MULTI_SETUP_CHUNKS 2025-03-05 01:02:21 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.MAKE_SHARD 2025-03-05 01:02:21 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.MAKE_SHARD.fork0.split 2025-03-05 01:02:27 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.MAKE_SHARD 2025-03-05 01:02:27 [runtime] (run:local) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.MAKE_SHARD.fork0.chnk0.main 2025-03-05 01:08:06 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.MAKE_SHARD 2025-03-05 01:08:06 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.MAKE_SHARD.fork0.join 2025-03-05 01:08:06 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.MAKE_SHARD 2025-03-05 01:08:06 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.BARCODE_CORRECTION 2025-03-05 01:08:06 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.BARCODE_CORRECTION.fork0.split 2025-03-05 01:08:06 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.BARCODE_CORRECTION 2025-03-05 01:08:06 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.BARCODE_CORRECTION.fork0.chnk0.main 2025-03-05 01:08:06 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.BARCODE_CORRECTION.fork0.chnk1.main 2025-03-05 01:08:06 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.BARCODE_CORRECTION.fork0.chnk2.main 
2025-03-05 01:08:13 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.BARCODE_CORRECTION 2025-03-05 01:08:13 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.BARCODE_CORRECTION.fork0.join 2025-03-05 01:08:13 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.BARCODE_CORRECTION 2025-03-05 01:08:13 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.WRITE_BARCODE_INDEX 2025-03-05 01:08:13 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.WRITE_BARCODE_INDEX.fork0.chnk0.main 2025-03-05 01:08:13 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER._SLFE_PARTIAL_FIRST_PASS.SUBSAMPLE_BARCODES 2025-03-05 01:08:13 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.SET_ALIGNER_SUBSAMPLE_RATE 2025-03-05 01:08:13 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.SET_ALIGNER_SUBSAMPLE_RATE.fork0.chnk0.main 2025-03-05 01:08:13 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER._SLFE_PARTIAL_FIRST_PASS.INITIAL_ALIGN_AND_COUNT 2025-03-05 01:08:13 [runtime] (chunks_complete) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.SET_ALIGNER_SUBSAMPLE_RATE 2025-03-05 01:08:13 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER._SLFE_PARTIAL_FIRST_PASS.SET_TARGETED_UMI_FILTER 2025-03-05 01:08:13 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.ALIGN_AND_COUNT 2025-03-05 01:08:13 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.ALIGN_AND_COUNT.fork0.split 2025-03-05 01:08:13 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.ALIGN_AND_COUNT 2025-03-05 01:08:13 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.ALIGN_AND_COUNT.fork0.chnk0.main 2025-03-05 01:08:13 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.ALIGN_AND_COUNT.fork0.chnk1.main 2025-03-05 01:08:13 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.ALIGN_AND_COUNT.fork0.chnk2.main 2025-03-05 01:08:13 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.WRITE_BARCODE_INDEX 2025-03-05 01:14:15 [runtime] (update) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.ALIGN_AND_COUNT.fork0 chunks running (0/3 completed) 2025-03-05 01:20:29 [runtime] (update) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.ALIGN_AND_COUNT.fork0 chunks running (2/3 completed) 2025-03-05 01:24:37 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.ALIGN_AND_COUNT 2025-03-05 01:24:37 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.ALIGN_AND_COUNT.fork0.join 2025-03-05 01:24:37 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.ALIGN_AND_COUNT 2025-03-05 01:24:37 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.WRITE_POS_BAM 2025-03-05 01:24:37 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.WRITE_POS_BAM.fork0.split 2025-03-05 01:24:37 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.COLLATE_METRICS 2025-03-05 01:24:37 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.COLLATE_METRICS.fork0.split 2025-03-05 01:24:37 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.WRITE_H5_MATRIX 2025-03-05 01:24:37 [runtime] 
(run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.WRITE_H5_MATRIX.fork0.chnk0.main 2025-03-05 01:24:37 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.WRITE_MATRIX_MARKET 2025-03-05 01:24:37 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.WRITE_MATRIX_MARKET.fork0.chnk0.main 2025-03-05 01:24:37 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.WRITE_BARCODE_SUMMARY 2025-03-05 01:24:37 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.WRITE_BARCODE_SUMMARY.fork0.chnk0.main 2025-03-05 01:24:37 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.COLLATE_METRICS 2025-03-05 01:24:37 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.COLLATE_METRICS.fork0.chnk0.main 2025-03-05 01:24:38 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.WRITE_POS_BAM 2025-03-05 01:24:38 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.WRITE_POS_BAM.fork0.chnk0.main 2025-03-05 01:24:38 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.WRITE_POS_BAM.fork0.chnk1.main 2025-03-05 01:24:38 
[runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.WRITE_POS_BAM.fork0.chnk2.main 2025-03-05 01:24:41 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.COLLATE_METRICS 2025-03-05 01:24:41 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.COLLATE_METRICS.fork0.join 2025-03-05 01:24:41 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.COLLATE_METRICS 2025-03-05 01:24:41 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.MERGE_METRICS 2025-03-05 01:24:41 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.MERGE_METRICS.fork0.chnk0.main 2025-03-05 01:24:41 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.MERGE_METRICS 2025-03-05 01:24:47 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.WRITE_H5_MATRIX 2025-03-05 01:24:47 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.FILTER_BARCODES 2025-03-05 01:24:47 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.FILTER_BARCODES.fork0.split 2025-03-05 01:24:50 [runtime] (chunks_complete) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.WRITE_MATRIX_MARKET 2025-03-05 01:24:53 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.FILTER_BARCODES 2025-03-05 01:24:53 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.FILTER_BARCODES.fork0.join 2025-03-05 01:25:45 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._MATRIX_COMPUTER.WRITE_BARCODE_SUMMARY 2025-03-05 01:26:28 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.FILTER_BARCODES 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_T_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.HANDLE_GEX_CELLS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.VDJ_B_GEM_WELL_PROCESSOR.SC_VDJ_CONTIG_ASSEMBLER.HANDLE_GEX_CELLS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_B_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.RUN_ENCLONE 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_T_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.RUN_ENCLONE 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.WRITE_MOLECULE_INFO 2025-03-05 01:26:28 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.WRITE_MOLECULE_INFO.fork0.chnk0.main 2025-03-05 01:26:28 [runtime] (ready) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.INFER_GEM_WELL_THROUGHPUT 2025-03-05 01:26:28 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.INFER_GEM_WELL_THROUGHPUT.fork0.split 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.VDJ_T_REPORTER.VLOUPE_PREPROCESS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_B_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.FILL_CLONOTYPE_INFO 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_B_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.WRITE_CONSENSUS_TXT 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_T_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.WRITE_CLONOTYPE_OUTS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_T_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.FILL_CLONOTYPE_INFO 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_B_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.WRITE_CLONOTYPE_OUTS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_T_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.WRITE_CONSENSUS_TXT 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.VDJ_B_REPORTER.VLOUPE_PREPROCESS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_B_CLONOTYPE_ASSIGNER.HANDLE_NO_VDJ_REF 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_B_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.WRITE_CONSENSUS_BAM 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COPY_VDJ_REFERENCE 2025-03-05 01:26:28 [runtime] (ready) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_T_CLONOTYPE_ASSIGNER.HANDLE_NO_VDJ_REF 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_T_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.WRITE_ANN_CSV 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_B_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.WRITE_ANN_CSV 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_T_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.WRITE_CONSENSUS_BAM 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_B_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.WRITE_CONCAT_REF_OUTS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_T_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.WRITE_CONCAT_REF_OUTS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.VDJ_T_REPORTER.WRITE_CONTIG_OUTS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.VDJ_B_REPORTER.WRITE_CONTIG_OUTS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_T_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.CREATE_AIRR_TSV 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.VDJ_T_REPORTER.REPORT_CONTIGS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.VDJ_B_CLONOTYPE_ASSIGNER.CLONOTYPE_ASSIGNER.CREATE_AIRR_TSV 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.VDJ_B_REPORTER.REPORT_CONTIGS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.VDJ_T_REPORTER.SUMMARIZE_VDJ_REPORTS 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.VDJ_B_REPORTER.SUMMARIZE_VDJ_REPORTS 2025-03-05 01:26:28 [runtime] (ready) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.VDJ_T_REPORTER.WRITE_CONTIG_PROTO 2025-03-05 01:26:28 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.VDJ_B_REPORTER.WRITE_CONTIG_PROTO 2025-03-05 01:26:32 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.INFER_GEM_WELL_THROUGHPUT 2025-03-05 01:26:32 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.INFER_GEM_WELL_THROUGHPUT.fork0.join 2025-03-05 01:26:36 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.INFER_GEM_WELL_THROUGHPUT 2025-03-05 01:26:36 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.CALL_TAGS_MARGINAL 2025-03-05 01:26:36 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.CALL_TAGS_MARGINAL.fork0.split 2025-03-05 01:26:39 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.CALL_TAGS_MARGINAL 2025-03-05 01:26:39 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.CALL_TAGS_MARGINAL.fork0.join 2025-03-05 01:27:32 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.CALL_TAGS_MARGINAL 2025-03-05 01:27:39 [runtime] (chunks_complete) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.WRITE_MOLECULE_INFO 2025-03-05 01:27:39 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.CALL_TAGS_JIBES 2025-03-05 01:27:39 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.CALL_TAGS_JIBES.fork0.split 2025-03-05 01:27:39 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUBSAMPLE_READS 2025-03-05 01:27:39 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUBSAMPLE_READS.fork0.split 2025-03-05 01:27:42 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUBSAMPLE_READS 2025-03-05 01:27:42 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUBSAMPLE_READS.fork0.chnk0.main 2025-03-05 01:27:42 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUBSAMPLE_READS.fork0.chnk1.main 2025-03-05 01:27:42 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUBSAMPLE_READS.fork0.chnk2.main 2025-03-05 01:27:42 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUBSAMPLE_READS.fork0.chnk3.main 2025-03-05 01:27:42 
[runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.WRITE_POS_BAM 2025-03-05 01:27:42 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.WRITE_POS_BAM.fork0.join 2025-03-05 01:27:43 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.CALL_TAGS_JIBES 2025-03-05 01:27:43 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.CALL_TAGS_JIBES.fork0.join 2025-03-05 01:27:48 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.CALL_TAGS_JIBES 2025-03-05 01:27:48 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.DETERMINE_SAMPLE_ASSIGNMENTS 2025-03-05 01:27:48 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.DETERMINE_SAMPLE_ASSIGNMENTS.fork0.chnk0.main 2025-03-05 01:27:49 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.DETERMINE_SAMPLE_ASSIGNMENTS 2025-03-05 01:27:49 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.MULTI_WRITE_PER_SAMPLE_MATRICES 2025-03-05 01:27:49 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.MULTI_COLLATE_PER_SAMPLE_METRICS 2025-03-05 01:27:49 [runtime] (ready) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.MULTI_WRITE_PER_SAMPLE_BAM 2025-03-05 01:27:49 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.COMPUTE_EXTRA_MULTIPLEXING_METRICS 2025-03-05 01:27:49 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.COMPUTE_EXTRA_MULTIPLEXING_METRICS.fork0.split 2025-03-05 01:27:49 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.MULTI_WRITE_PER_SAMPLE_MOLECULE_INFO 2025-03-05 01:27:49 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.STRUCTIFY_PER_SAMPLE_OUTS 2025-03-05 01:27:49 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.STRUCTIFY_PER_SAMPLE_OUTS.fork0.chnk0.main 2025-03-05 01:27:50 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.STRUCTIFY_PER_SAMPLE_OUTS 2025-03-05 01:27:50 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.DISABLE_FEATURE_STAGES 2025-03-05 01:27:50 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.DISABLE_FEATURE_STAGES.fork0.chnk0.main 2025-03-05 01:27:50 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.DISABLE_FEATURE_STAGES 2025-03-05 01:27:50 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.ANALYZER_PREFLIGHT 2025-03-05 01:27:50 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.ANALYZER_PREFLIGHT.fork0.chnk0.main 2025-03-05 01:27:50 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER._TARGETED_ANALYZER.SUBSAMPLE_OFF_TARGET_READS 2025-03-05 01:27:50 [runtime] (ready) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER._TARGETED_ANALYZER.SUBSAMPLE_ON_TARGET_READS 2025-03-05 01:27:50 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER._ANTIBODY_ANALYZER.SUMMARIZE_ANTIBODY_ANALYSIS 2025-03-05 01:27:50 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER._ANTIBODY_ANALYZER.CALL_ANTIBODIES 2025-03-05 01:27:51 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.COMPUTE_EXTRA_MULTIPLEXING_METRICS 2025-03-05 01:27:51 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.COMPUTE_EXTRA_MULTIPLEXING_METRICS.fork0.join 2025-03-05 01:27:52 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.ANALYZER_PREFLIGHT 2025-03-05 01:27:52 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.PREPROCESS_MATRIX 2025-03-05 01:27:52 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.PREPROCESS_MATRIX.fork0.split 2025-03-05 01:27:53 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.COMPUTE_EXTRA_MULTIPLEXING_METRICS 2025-03-05 01:27:53 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.MERGE_METRICS 2025-03-05 01:27:53 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.MERGE_METRICS.fork0.chnk0.main 2025-03-05 01:27:53 [runtime] (chunks_complete) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._ASSIGN_TAGS.MERGE_METRICS 2025-03-05 01:27:55 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.WRITE_POS_BAM 2025-03-05 01:27:56 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.PREPROCESS_MATRIX 2025-03-05 01:27:56 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.PREPROCESS_MATRIX.fork0.join 2025-03-05 01:28:02 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.PREPROCESS_MATRIX 2025-03-05 01:28:02 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_PCA 2025-03-05 01:28:02 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_PCA.fork0.split 2025-03-05 01:28:02 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_FBPCA 2025-03-05 01:28:02 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_MULTIGENOME_ANALYSIS 2025-03-05 01:28:02 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_MULTIGENOME_ANALYSIS.fork0.split 2025-03-05 01:28:02 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.CORRECT_CHEMISTRY_BATCH 2025-03-05 01:28:02 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_PCA 2025-03-05 01:28:02 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_PCA.fork0.join 2025-03-05 01:28:04 [runtime] (split_complete) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_MULTIGENOME_ANALYSIS 2025-03-05 01:28:04 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_MULTIGENOME_ANALYSIS.fork0.join 2025-03-05 01:28:06 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_MULTIGENOME_ANALYSIS 2025-03-05 01:28:07 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_PCA 2025-03-05 01:28:07 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.CHOOSE_DIMENSION_REDUCTION_OUTPUT 2025-03-05 01:28:07 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.CHOOSE_DIMENSION_REDUCTION_OUTPUT.fork0.chnk0.main 2025-03-05 01:28:07 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.CHOOSE_DIMENSION_REDUCTION_OUTPUT 2025-03-05 01:28:08 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_UMAP 2025-03-05 01:28:08 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_UMAP.fork0.split 2025-03-05 01:28:08 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_GRAPH_CLUSTERING 2025-03-05 01:28:08 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_GRAPH_CLUSTERING.fork0.split 2025-03-05 01:28:08 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_TSNE 2025-03-05 01:28:08 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_TSNE.fork0.split 2025-03-05 01:28:08 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS 2025-03-05 01:28:08 [runtime] 
(run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS.fork0.split 2025-03-05 01:28:08 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_GRAPH_CLUSTERING 2025-03-05 01:28:08 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_GRAPH_CLUSTERING.fork0.join 2025-03-05 01:28:08 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_TSNE 2025-03-05 01:28:08 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_TSNE.fork0.chnk0.main 2025-03-05 01:28:08 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_UMAP 2025-03-05 01:28:08 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_UMAP.fork0.chnk0.main 2025-03-05 01:28:10 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS 2025-03-05 01:28:10 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS.fork0.chnk0.main 2025-03-05 01:28:10 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS.fork0.chnk1.main 2025-03-05 01:28:10 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS.fork0.chnk2.main 2025-03-05 01:28:10 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS.fork0.chnk3.main 2025-03-05 01:28:10 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS.fork0.chnk4.main 2025-03-05 01:28:10 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS.fork0.chnk5.main 
2025-03-05 01:28:10 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS.fork0.chnk6.main 2025-03-05 01:28:10 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS.fork0.chnk7.main 2025-03-05 01:28:10 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS.fork0.chnk8.main 2025-03-05 01:28:10 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_GRAPH_CLUSTERING 2025-03-05 01:28:14 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_TSNE 2025-03-05 01:28:14 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_TSNE.fork0.join 2025-03-05 01:28:14 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_TSNE 2025-03-05 01:28:16 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS 2025-03-05 01:28:16 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS.fork0.join 2025-03-05 01:28:19 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_KMEANS 2025-03-05 01:28:19 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.COMBINE_CLUSTERING 2025-03-05 01:28:19 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.COMBINE_CLUSTERING.fork0.chnk0.main 2025-03-05 01:28:20 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.COMBINE_CLUSTERING 2025-03-05 01:28:20 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION 
2025-03-05 01:28:20 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION.fork0.split 2025-03-05 01:28:20 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION 2025-03-05 01:28:20 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION.fork0.chnk0.main 2025-03-05 01:28:20 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION.fork0.chnk1.main 2025-03-05 01:28:20 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION.fork0.chnk2.main 2025-03-05 01:28:20 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION.fork0.chnk3.main 2025-03-05 01:28:20 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION.fork0.chnk4.main 2025-03-05 01:28:20 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION.fork0.chnk5.main 2025-03-05 01:28:20 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION.fork0.chnk6.main 2025-03-05 01:28:20 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION.fork0.chnk7.main 2025-03-05 01:28:31 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_UMAP 2025-03-05 01:28:31 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_UMAP.fork0.join 2025-03-05 01:28:31 [runtime] (join_complete) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_UMAP 2025-03-05 01:28:34 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION 2025-03-05 01:28:34 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION.fork0.join 2025-03-05 01:28:34 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUBSAMPLE_READS 2025-03-05 01:28:34 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUBSAMPLE_READS.fork0.join 2025-03-05 01:28:35 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.RUN_DIFFERENTIAL_EXPRESSION 2025-03-05 01:28:35 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.SUMMARIZE_ANALYSIS 2025-03-05 01:28:35 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.SUMMARIZE_ANALYSIS.fork0.split 2025-03-05 01:28:35 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.SUMMARIZE_ANALYSIS 2025-03-05 01:28:35 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.SUMMARIZE_ANALYSIS.fork0.chnk0.main 2025-03-05 01:28:36 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUBSAMPLE_READS 2025-03-05 01:28:36 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUMMARIZE_BASIC_REPORTS 2025-03-05 01:28:36 [runtime] (run:local) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUMMARIZE_BASIC_REPORTS.fork0.split 2025-03-05 01:28:37 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.SUMMARIZE_ANALYSIS 2025-03-05 01:28:37 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.SUMMARIZE_ANALYSIS.fork0.join 2025-03-05 01:28:38 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER.SC_RNA_ANALYZER.SUMMARIZE_ANALYSIS 2025-03-05 01:28:38 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.GENERATE_LIBRARY_PLOTS 2025-03-05 01:28:38 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUMMARIZE_BASIC_REPORTS 2025-03-05 01:28:38 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUMMARIZE_BASIC_REPORTS.fork0.join 2025-03-05 01:28:45 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER._CELLS_REPORTER.SUMMARIZE_BASIC_REPORTS 2025-03-05 01:28:45 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.MERGE_METRICS 2025-03-05 01:28:45 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.MERGE_METRICS.fork0.chnk0.main 2025-03-05 01:28:46 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_GEM_WELL_PROCESSOR.COUNT_GEM_WELL_PROCESSOR._BASIC_SC_RNA_COUNTER.MERGE_METRICS 2025-03-05 01:28:46 [runtime] (ready) 
ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER._TARGETED_ANALYZER.CALCULATE_TARGETED_METRICS 2025-03-05 01:28:46 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER._TARGETED_ANALYZER.SUMMARIZE_TARGETED_ANALYSIS 2025-03-05 01:28:46 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER._CRISPR_ANALYZER.CALL_PROTOSPACERS 2025-03-05 01:28:46 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.SUMMARIZE_REPORTS 2025-03-05 01:28:46 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.SUMMARIZE_REPORTS.fork0.chnk0.main 2025-03-05 01:28:46 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER._CRISPR_ANALYZER._PERTURBATIONS_BY_TARGET 2025-03-05 01:28:46 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER._CRISPR_ANALYZER._PERTURBATIONS_BY_FEATURE 2025-03-05 01:28:46 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.COUNT_ANALYZER._CRISPR_ANALYZER.SUMMARIZE_CRISPR_ANALYSIS 2025-03-05 01:28:52 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.SUMMARIZE_REPORTS 2025-03-05 01:28:52 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.CLOUPE_PREPROCESS 2025-03-05 01:28:52 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.CLOUPE_PREPROCESS.fork0.split 2025-03-05 01:28:54 [runtime] (split_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.CLOUPE_PREPROCESS 2025-03-05 01:28:54 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.CLOUPE_PREPROCESS.fork0.chnk0.main 2025-03-05 01:29:06 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.CLOUPE_PREPROCESS 2025-03-05 01:29:06 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.CLOUPE_PREPROCESS.fork0.join 2025-03-05 
01:29:07 [runtime] (join_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.CLOUPE_PREPROCESS 2025-03-05 01:29:07 [runtime] (ready) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.CHOOSE_CLOUPE 2025-03-05 01:29:07 [runtime] (run:local) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.CHOOSE_CLOUPE.fork0.chnk0.main 2025-03-05 01:29:07 [runtime] (chunks_complete) ID.Control-CAF.SC_RNA_COUNTER_CS.SC_MULTI_CORE.MULTI_REPORTER.CHOOSE_CLOUPE
Outputs:
- Run summary HTML: /anvil/scratch/x-liu2302/unit1_demo/data/cellranger/Control-CAF/outs/web_summary.html
- Run summary CSV: /anvil/scratch/x-liu2302/unit1_demo/data/cellranger/Control-CAF/outs/metrics_summary.csv
- BAM: /anvil/scratch/x-liu2302/unit1_demo/data/cellranger/Control-CAF/outs/possorted_genome_bam.bam
- BAM index: /anvil/scratch/x-liu2302/unit1_demo/data/cellranger/Control-CAF/outs/possorted_genome_bam.bam.bai
- Filtered feature-barcode matrices MEX: /anvil/scratch/x-liu2302/unit1_demo/data/cellranger/Control-CAF/outs/filtered_feature_bc_matrix
- Filtered feature-barcode matrices HDF5: /anvil/scratch/x-liu2302/unit1_demo/data/cellranger/Control-CAF/outs/filtered_feature_bc_matrix.h5
- Unfiltered feature-barcode matrices MEX: /anvil/scratch/x-liu2302/unit1_demo/data/cellranger/Control-CAF/outs/raw_feature_bc_matrix
- Unfiltered feature-barcode matrices HDF5: /anvil/scratch/x-liu2302/unit1_demo/data/cellranger/Control-CAF/outs/raw_feature_bc_matrix.h5
- Secondary analysis output CSV: /anvil/scratch/x-liu2302/unit1_demo/data/cellranger/Control-CAF/outs/analysis
- Per-molecule read information: /anvil/scratch/x-liu2302/unit1_demo/data/cellranger/Control-CAF/outs/molecule_info.h5
- CRISPR-specific analysis: null
- CSP-specific analysis: null
- Loupe Browser file: /anvil/scratch/x-liu2302/unit1_demo/data/cellranger/Control-CAF/outs/cloupe.cloupe
- Feature Reference: null
- Target Panel File: null
- Probe Set File: null
Waiting 6 seconds for UI to do final refresh.
Pipestance completed successfully!
2025-03-05 01:29:14 Shutting down.
We have processed all samples. All output files are saved at: "/anvil/projects/x-tra220018/current/datasets/single_cellData/Ratliff_CAF/results/Control-CAF/outs/". We will directly load the count data matrix in folder "filtered_feature_bc_matrix" for analysis with Seurat.
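Before loading the matrix, it can help to confirm that the folder contains the files Read10X looks for. The sketch below assumes the Cell Ranger v3+ file naming (barcodes.tsv.gz, features.tsv.gz, matrix.mtx.gz). The helper check_10x_dir is a hypothetical name introduced here for illustration, and the temporary directory only stands in for the real "filtered_feature_bc_matrix" folder.

```r
# Hypothetical helper (not part of Seurat or cellranger): return the names of
# the expected 10x files that are missing from a directory.
check_10x_dir <- function(dir) {
  # Assumption: Cell Ranger v3+ naming of the MEX output files
  expected <- c("barcodes.tsv.gz", "features.tsv.gz", "matrix.mtx.gz")
  expected[!file.exists(file.path(dir, expected))]
}

# Demonstration on a temporary directory standing in for the real folder
demo_dir <- file.path(tempdir(), "filtered_feature_bc_matrix")
dir.create(demo_dir, showWarnings = FALSE)

# Create two of the three files, so that one is reported as missing
file.create(file.path(demo_dir, c("barcodes.tsv.gz", "matrix.mtx.gz")))

missing_files <- check_10x_dir(demo_dir)
print(missing_files)
```

On the real data you would pass the path to your own "filtered_feature_bc_matrix" folder. An empty result means all three files are present.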
For the following processing and analysis steps, we are going to use Seurat, a popular R package that provides well-curated functions and workflows for scRNA-seq data. Seurat was first developed for clustering scRNA-seq data, but with continuing updates over the last few years it has also become a popular tool for QC, analysis, and exploration. Seurat is easy to use and is also a very powerful analysis tool, with workflows that are well maintained and regularly updated. For more information, see the Seurat website from the Satija Lab, which has very nice documentation, links to the Satija Lab publications, and detailed tutorials and vignettes.
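In this notebook, we will use Seurat 4.0. The slot-based accessors used below, such as caf_ctrl[["RNA"]]@counts, follow the Seurat 4 object structure and may need to be adapted for other major versions.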
Before we can perform any analysis, we need to import the pre-processed data into R and set up a Seurat object. The Read10X command reads the filtered feature-barcode matrix generated by cellranger count and returns a count matrix, caf_ctrl_data. Each row of this matrix is a feature (gene) and each column is a cell. This is the same filtered matrix that cellranger produced: each entry is the number of unique molecules observed for a feature (gene; row) in a cell (column).
Next, we will use this count matrix to create a Seurat object, caf_ctrl, which serves as a container for data, analysis, and metadata. The count matrix is stored as caf_ctrl[["RNA"]]@counts.
In this section, we will only take the "Control" group as an example. After running the following code, we can see that the "Control" group contains 33694 genes and 3321 cells.
The following code might generate a warning message and you can ignore it.
%use r
# -------------------- Running only on Control treatment ----------------- #
# Create Seurat object
library(Seurat)
data_path="/anvil/projects/x-tra220018/current/datasets/single_cellData/Ratliff_CAF/results"
caf_ctrl_data <- Read10X(data.dir = paste0(data_path, "/Control-CAF/outs/filtered_feature_bc_matrix"))
caf_ctrl <- CreateSeuratObject(counts = caf_ctrl_data, project = "CAFCTRL")
caf_ctrl
# caf_ctrl[["RNA"]]@counts
'as(<dgTMatrix>, "dgCMatrix")' is deprecated.
Use 'as(., "CsparseMatrix")' instead.
See help("Deprecated") and help("Matrix-deprecated").
Warning message:
“Feature names cannot have underscores ('_'), replacing with dashes ('-')”
An object of class Seurat 33694 features across 3321 samples within 1 assay Active assay: RNA (33694 features, 0 variable features)
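To make the layout of the count matrix concrete, here is a minimal sketch with a made-up 3 x 3 sparse matrix. The gene names, cell names, and counts are hypothetical; only the orientation (genes in rows, cells in columns) and the meaning of nCount_RNA and nFeature_RNA are taken from the text. It uses the Matrix package, which ships with standard R installations.

```r
# Toy stand-in for the matrix returned by Read10X (all values are made up).
library(Matrix)

toy_counts <- sparseMatrix(
  i = c(1, 3, 2, 1),   # row (gene) index of each non-zero entry
  j = c(1, 1, 2, 3),   # column (cell) index of each non-zero entry
  x = c(5, 2, 7, 1),   # UMI counts
  dims = c(3, 3),
  dimnames = list(c("GENE-A", "GENE-B", "MT-CO1"),
                  c("cell1", "cell2", "cell3"))
)

dim(toy_counts)                       # genes x cells
n_count   <- colSums(toy_counts)      # total molecules per cell (nCount_RNA)
n_feature <- colSums(toy_counts > 0)  # genes detected per cell (nFeature_RNA)
n_count
n_feature
```

Only the non-zero entries are stored, which is why a matrix with tens of thousands of genes and thousands of cells still fits comfortably in memory.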
First, we want to obtain a list of mitochondrial genes, which we will use to identify potentially stressed or damaged cells. We do this with the grep function, a pattern-matching function that we use to search all the gene names for the pattern denoting mitochondrial genes. The pattern we are looking for is "^MT-": the "^" anchors the match to the beginning of the gene name, so only names that start with "MT-" are returned. Note that mitochondrial genes may be named differently in other genomes or annotation versions. If grep finds no row names matching the pattern, check the genome version and annotation you used to see how mitochondrial genes are named; some annotations use "M" or "Mito" instead. Simply search for whatever pattern is appropriate for your data. Setting "value = TRUE" returns a vector of the matching gene names themselves, rather than a vector of their indices.
Next, you will calculate the percentage of reads that match this pattern (the percentage of reads mapping to mitochondrial transcripts) using the Seurat function PercentageFeatureSet with the same pattern used to specify mitochondrial genes. You will add this information to metadata in the caf_ctrl Seurat object.
# ------ Quality Control ------ #
# Percent mitochondrial genes
mito.genes <- grep(pattern = "^MT-", x = rownames(x = caf_ctrl[["RNA"]]@counts), value = TRUE)
caf_ctrl[["percent.mito"]] <- PercentageFeatureSet(caf_ctrl, pattern = "^MT-")
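As a rough illustration of what this percentage means, the sketch below computes it by hand in base R: for each cell, the counts of all genes matching the pattern are summed and divided by the cell's total counts. The toy matrix and its values are made up, and this is only meant to mirror the idea behind PercentageFeatureSet, not its implementation.

```r
# Toy counts: rows are genes, columns are cells (all values are made up).
toy_counts <- matrix(
  c(10, 0,  5,
    20, 8,  5,
     5, 2, 10,
     5, 0,  0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("ACTB", "GAPDH", "MT-CO1", "MT-ND4"),
                  c("cell1", "cell2", "cell3"))
)

# Row indices of the genes whose names start with "MT-"
mito_rows <- grep("^MT-", rownames(toy_counts))

# Percentage of each cell's counts that come from mitochondrial genes
percent_mito <- 100 * colSums(toy_counts[mito_rows, , drop = FALSE]) /
  colSums(toy_counts)
percent_mito
```

If mito_rows comes back empty, the pattern does not match the gene naming of your annotation and should be adjusted as described above.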
Next, we will visualize the resulting data by making violin plots of specific features. In this case, we look at nCount_RNA (the number of molecules detected in a cell), nFeature_RNA (the number of genes detected in a cell), and percent.mito, which you calculated above. Each of these measures gives you an idea of the quality of the cells in your dataset. Low nCount_RNA means the cell could be dead or dying, or the droplet may have been empty. Low nFeature_RNA can likewise indicate a dead or dying cell, whereas high nFeature_RNA may indicate a doublet or multiplet. A high percent.mito points to a stressed or damaged cell. These features can be used in combination to filter the dataset and remove damaged or low-quality cells and doublets.
# Visualize QC metrics as violin plots
options(repr.plot.width=15, repr.plot.height=6) # set plot size
VlnPlot(object = caf_ctrl, features = c("nFeature_RNA", "nCount_RNA", "percent.mito"), ncol = 3)
Another way to visualize the QC metrics is the FeatureScatter function in Seurat, which generates scatter plots of one metric against another, here nCount_RNA against percent.mito and nCount_RNA against nFeature_RNA. A balance must be struck between keeping as much data as possible and removing potentially damaged cells and multiplets. FeatureScatter is typically used to visualize feature-feature relationships, but can be used for anything calculated by the object, i.e. columns in the object metadata, PC scores, etc.
# Scatter plot for nCount_RNA, percent.mito, nFeature_RNA and filtering based on these variables
# FeatureScatter is typically used to visualize feature-feature relationships, but can be used
# for anything calculated by the object, i.e. columns in object metadata, PC scores etc.
plot1 <- FeatureScatter(caf_ctrl, feature1 = "nCount_RNA", feature2 = "percent.mito")
plot2 <- FeatureScatter(caf_ctrl, feature1 = "nCount_RNA", feature2 = "nFeature_RNA")
library(patchwork)
plot1 + plot2 +plot_layout(ncol=2)
After visualizing the QC metrics, we are ready to filter the data based on plots using the subset function. Here we use a very loose condition to simply remove dead/damaged cells as well as potential doublets/multiplets. You may adjust the parameters as you like based on the plots.
# Filter cells
caf_ctrl <- subset(caf_ctrl, subset = nFeature_RNA > 2000 &
percent.mito < 30 & nCount_RNA >10000 & nCount_RNA <90000)
save(caf_ctrl, file="./data/caf_ctrl_QC.RData")
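When adjusting the thresholds, it can help to see how many cells each criterion removes on its own. The sketch below does this in base R on a small hypothetical QC table that stands in for caf_ctrl@meta.data; the five cells and their values are invented, while the cut-offs are the ones used in the subset call above.

```r
# Hypothetical per-cell QC table standing in for caf_ctrl@meta.data.
qc <- data.frame(
  nFeature_RNA = c(3489, 1500, 6210, 5831, 800),
  nCount_RNA   = c(14952, 9000, 49808, 95000, 3000),
  percent.mito = c(21.4, 12.0, 4.1, 3.0, 45.0),
  row.names    = paste0("cell", 1:5)
)

# One logical column per criterion, mirroring the subset() call above.
pass <- data.frame(
  enough_genes  = qc$nFeature_RNA > 2000,
  low_mito      = qc$percent.mito < 30,
  enough_counts = qc$nCount_RNA > 10000,
  not_too_many  = qc$nCount_RNA < 90000
)

failed_per_rule <- colSums(!pass)    # cells failing each criterion
keep <- rowSums(pass) == ncol(pass)  # cells passing every criterion
failed_per_rule
rownames(qc)[keep]
```

On real data, the same table can be built from the metadata columns of the Seurat object before calling subset, so that a single overly strict cut-off is easy to spot.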
After removing unwanted cells, we are ready to normalize the data using the NormalizeData command. There are a number of methods for normalizing data in Seurat. In this example we use the global-scaling method "LogNormalize", which Seurat employs by default: the gene expression measurements for each cell are divided by the total expression of that cell, multiplied by a scaling factor, and then log-transformed. Here we set the scaling factor to 1,000,000, which makes the result equivalent to log CPM (counts per million). Another common scaling factor is 10,000, which is the NormalizeData default. These normalized values are then stored in the caf_ctrl object.
%use r
library(Seurat)
load("./data/caf_ctrl_QC.RData")
caf_ctrl <- NormalizeData(object = caf_ctrl, normalization.method = "LogNormalize", scale.factor = 1000000)
# Normalized values are stored in caf_ctrl[["RNA"]]@data.
Performing log-normalization 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************|
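The arithmetic behind this normalization can be sketched in a few lines of base R. The toy matrix is made up, and the formula used here, log(1 + counts / total counts per cell x scale factor) with the natural logarithm, is our understanding of what "LogNormalize" computes; treat it as an illustration rather than a replacement for NormalizeData.

```r
# Toy counts: 2 genes x 2 cells (all values are made up).
toy_counts <- matrix(c(1, 3, 0, 4), nrow = 2,
                     dimnames = list(c("geneA", "geneB"),
                                     c("cell1", "cell2")))
scale_factor <- 1e6   # same scaling factor as in the NormalizeData call above

# Divide every column (cell) by its total, then rescale to counts per million
cpm <- sweep(toy_counts, 2, colSums(toy_counts), "/") * scale_factor

# log1p(x) is log(1 + x), so zero counts stay at zero
log_norm <- log1p(cpm)
log_norm
```

Because each cell is divided by its own total, cells with different sequencing depths become comparable before any downstream step.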
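Next, we will reduce the dimensions of the Seurat object by finding highly variable genes, which highlight biological signal (Brennecke et al. 2013), and focus on these for the downstream analysis. This is done with the FindVariableFeatures function (called FindVariableGenes in older Seurat versions). With selection.method = "mean.var.plot", the function calculates the average expression and dispersion for every gene, places the genes into bins by average expression, and then calculates a z-score for dispersion within each bin. The code below instead uses selection.method = "vst", which fits a smooth curve to the relationship between log(variance) and log(mean), standardizes each gene using its observed mean and the variance expected from that curve, and ranks genes by the variance of the standardized values. Both approaches help control for the mean-variance relationship that we discussed in the bulk RNA-seq section and that is discussed in more detail in Macosko et al. Note that the mean.cutoff and dispersion.cutoff arguments apply to the "mean.var.plot" method; with "vst", the number of selected genes is set by nfeatures.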
By default, 2000 variable features are selected from each dataset, which we can see using the length function in R.
%use r
# Finding highly variable genes (feature selection), we return 2000 features per dataset by default
caf_ctrl <- FindVariableFeatures(object = caf_ctrl, selection.method ="vst",
mean.cutoff = c(0.1,6), dispersion.cutoff = c(0.5,Inf),verbose=TRUE)
length(VariableFeatures(caf_ctrl))
Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************|
Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************|
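The idea behind variance-stabilized feature selection can be sketched in base R on simulated counts. This is a simplified illustration under our own assumptions (Poisson-distributed toy genes, 20 artificially over-dispersed genes, a loess span of 0.3, clipping at the square root of the number of cells); it is not the Seurat implementation, and all object names are hypothetical.

```r
set.seed(1)
n_genes <- 300
n_cells <- 200

# Simulated counts: each gene has its own mean, Poisson noise across cells
lambda <- exp(runif(n_genes, log(0.5), log(50)))
counts <- matrix(rpois(n_genes * n_cells, rep(lambda, times = n_cells)),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Make the first 20 genes over-dispersed: on in about half the cells, off in the rest
hv <- 1:20
counts[hv, ] <- counts[hv, ] * rbinom(length(hv) * n_cells, 1, 0.5) * 4

gene_mean <- rowMeans(counts)
gene_var  <- apply(counts, 1, var)

# Expected variance as a smooth function of the mean (log-log scale)
fit <- loess(log10(gene_var) ~ log10(gene_mean), span = 0.3)
expected_sd <- sqrt(10^fitted(fit))

# Standardize each gene, clip extreme values, and rank by remaining variance
standardized <- (counts - gene_mean) / expected_sd
standardized <- pmin(standardized, sqrt(n_cells))
std_var <- apply(standardized, 1, var)

top <- names(sort(std_var, decreasing = TRUE))[1:20]
top
```

Genes that are more variable than expected for their mean end up with a standardized variance well above 1, which is what makes them stand out regardless of how highly they are expressed.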
The head function from base R returns the first 10 entries of VariableFeatures, which are the 10 most variable genes, and VariableFeaturePlot visualizes all features. The variable features are highlighted in red in the plots, and the top 10 variable features are labeled in the second plot. (There might be several warning messages and you can safely ignore them.)
%use r
library(patchwork)
# Identify the 10 most highly variable genes
top10 <- head(VariableFeatures(caf_ctrl), 10)
#top10 # check top 10 highly variable genes
# plot variable features with and without labels
plot1 <- VariableFeaturePlot(caf_ctrl)
plot2 <- LabelPoints(plot = plot1, points = top10, repel = TRUE)
plot1 + plot2 + plot_layout(ncol=2)
When using repel, set xnudge and ynudge to 0 for optimal results Warning message: “Transformation introduced infinite values in continuous x-axis” Warning message: “Transformation introduced infinite values in continuous x-axis”
Single-cell datasets often contain many uninteresting sources of variation, such as technical noise, cell cycle stage, and batch effects. A major source of systematic bias in scRNAseq is the cell cycle, which usually introduces within-cell-type heterogeneity. This bias can obscure the differences in expression between cell types. Regressing out these signals can vastly improve downstream dimensionality reduction, clustering, and differential expression analyses.
To remove these effects, Seurat constructs linear models to predict gene expression from the chosen variables. The scaled z-score residuals of these linear models are then stored in the Seurat object and used for dimensionality reduction and clustering. Before we can regress out cell cycle effects, we must determine where the cells are in the cell cycle. We do this by calculating cell-cycle scores, which we then regress out along with other potentially uninteresting sources of variation, such as the number of detected molecules per cell and the percentage of reads mapping to mitochondrial genes.
The cell cycle scores are calculated using CellCycleScoring function in Seurat, and results are saved in the object meta data.
Here, we load a list of genes associated with either the S phase or the G2/M phase of the cell cycle. These markers, originally published in (Kowalczyk et al, 2015) are loaded with Seurat. Note that although Seurat predicts the cell cycle phase of each cell, these predictions are not used in downstream data analysis. Instead, Seurat uses the quantitative cell cycle score in downstream scaling. (It might produce several warning messages, but this won't affect our results.)
%use r
# Cell cycle scoring
s.genes <- cc.genes$s.genes
g2m.genes <- cc.genes$g2m.genes
caf_ctrl <- CellCycleScoring(caf_ctrl, s.features = s.genes, g2m.features = g2m.genes,
set.ident = F)
head(caf_ctrl@meta.data)
unique(caf_ctrl@meta.data$Phase)
Warning message: “The following features are not present in the object: MLF1IP, not searching for symbol synonyms”
| | orig.ident | nCount_RNA | nFeature_RNA | percent.mito | S.Score | G2M.Score | Phase |
|---|---|---|---|---|---|---|---|
| | <fct> | <dbl> | <int> | <dbl> | <dbl> | <dbl> | <chr> |
| AAACCTGAGACGCTTT-1 | CAFCTRL | 14952 | 3489 | 21.421883 | -0.21850143 | 0.1015109 | G2M |
| AAACCTGAGGGTATCG-1 | CAFCTRL | 26371 | 5111 | 9.408062 | 0.07548027 | -0.1098930 | S |
| AAACCTGGTCTAGTCA-1 | CAFCTRL | 49808 | 6210 | 4.051558 | -0.08005434 | -0.1029985 | G1 |
| AAACCTGGTGATGTGG-1 | CAFCTRL | 48327 | 5831 | 3.041778 | 0.66149059 | 1.3811409 | G2M |
| AAACCTGGTTATCACG-1 | CAFCTRL | 69105 | 6968 | 5.348383 | 0.20619971 | -0.2843568 | S |
| AAACCTGGTTGGTTTG-1 | CAFCTRL | 21576 | 4376 | 4.718205 | 0.09631241 | -0.1565114 | S |
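The relationship between the two scores and the Phase column can be illustrated with the six cells shown in the table above. The rule sketched here (G1 when both scores are negative, otherwise the phase with the higher score) is our reading of how CellCycleScoring assigns phases; ties between the two scores are not handled in this sketch, and assign_phase is a hypothetical helper name.

```r
# S and G2M scores of the six cells shown in the table above
scores <- data.frame(
  S.Score   = c(-0.21850143, 0.07548027, -0.08005434,
                0.66149059, 0.20619971, 0.09631241),
  G2M.Score = c(0.1015109, -0.1098930, -0.1029985,
                1.3811409, -0.2843568, -0.1565114)
)

# Hypothetical helper: G1 if both scores are negative,
# otherwise whichever of S and G2M has the higher score
assign_phase <- function(s, g2m) {
  ifelse(s < 0 & g2m < 0, "G1", ifelse(s > g2m, "S", "G2M"))
}

scores$Phase <- assign_phase(scores$S.Score, scores$G2M.Score)
scores
```

The phases obtained this way agree with the Phase column of the table for these six cells.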
Before performing a dimension reduction method such as PCA, we need to adjust for the cell cycle and remove other unwanted sources of variation in the expression data. We add the difference between the S and G2M scores (CC.Difference) to the meta data of the caf_ctrl Seurat object (note that there are two different ways we can add information to meta data: the [[ ]] assignment used earlier for percent.mito, and direct assignment to a column of @meta.data as done here). We then use the ScaleData function to regress out the variables that we have decided could introduce uninteresting variability, and to center and scale each gene across cells.
By performing scaling, we adjust the weight of each gene such that highly expressed genes won't dominate low-expressed genes in downstream analysis.
%use r
caf_ctrl@meta.data$CC.Difference <- caf_ctrl@meta.data$S.Score - caf_ctrl@meta.data$G2M.Score
caf_ctrl <- ScaleData(object = caf_ctrl, vars.to.regress = c("nCount_RNA", "percent.mito", "CC.Difference"))
Regressing out nCount_RNA, percent.mito, CC.Difference Centering and scaling data matrix
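For a single gene, the combination of regressing out covariates and scaling can be sketched in base R as follows. The data are simulated and the object names are hypothetical; the sketch only illustrates the idea of keeping the z-scored residuals of a linear model, and leaves out details of ScaleData such as clipping of extreme values.

```r
set.seed(42)
n_cells <- 100

# Simulated covariates and one gene whose expression depends on them
cells <- data.frame(
  nCount_RNA   = runif(n_cells, 1e4, 9e4),
  percent.mito = runif(n_cells, 0, 30)
)
cells$expr <- 2 + 0.00002 * cells$nCount_RNA +
  0.05 * cells$percent.mito + rnorm(n_cells, sd = 0.3)

# Linear model of expression on the unwanted covariates
fit <- lm(expr ~ nCount_RNA + percent.mito, data = cells)

# The residuals are what is left after removing the covariate effects;
# scale() then centers them to mean 0 and rescales to standard deviation 1
scaled <- as.numeric(scale(residuals(fit)))

c(mean = mean(scaled), sd = sd(scaled))
cor(scaled, cells$nCount_RNA)
```

The residuals are uncorrelated with the regressed variables by construction, which is why these variables no longer drive the principal components computed from the scaled data.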
Always remember to save the intermediate Seurat object to avoid rerunning previous steps.
save(caf_ctrl, file="./data/caf_ctrl_norm.RData")
Now, we will perform PCA on the scaled data. With UMI data, PCA tends to return similar results whether it is run on all genes or only on the highly variable subset. For this reason, dimensionality reduction and subsequent clustering are now usually run on the highly variable genes, which reduces the computational resources and time needed and helps to highlight biological signal.
PCA is performed on the scaled data with the RunPCA function. Several visualization methods are shown below using DimPlot, VizDimLoadings, and DimHeatmap.
library(Seurat)
load("./data/caf_ctrl_norm.RData")
caf_ctrl <- RunPCA(object = caf_ctrl, verbose=F)
DimPlot(object = caf_ctrl)
Here we visualize the top genes associated with reduction components for the first 2 PCs.
VizDimLoadings(object = caf_ctrl, dims = 1:2)
DimHeatmap is a useful way to identify the primary sources of heterogeneity, and is often used to determine how many PCs to include in downstream analysis. Here, we generate the heatmap with the first 9 PCs. The cells and genes are sorted by their principal component scores and the heatmaps allow us to visualize the heterogeneity in the data. By setting cells = 500, we are plotting the 500 most extreme cells on both ends of the spectrum.
DimHeatmap(object = caf_ctrl, dims = 1:9, cells = 500, balanced = TRUE)
Next, we want to determine the dimensionality of the data, that is, how many PCs to use in downstream analyses, as not all of them are likely to be important. The JackStraw function randomly permutes a subset of the data (1% by default) and calculates projected PCA scores for these random genes; here this is repeated 100 times (num.replicate = 100) to build a null distribution. Comparing the PCA scores of the observed data with this null distribution gives a p-value for each gene's association with each principal component. The ScoreJackStraw function then computes an overall significance score for each PC. The p-value for each PC is based on a proportion test that compares the proportion of features with a p-value below a threshold (1e-05) with the proportion expected under a uniform distribution of p-values. Significant PCs are expected to show a p-value distribution that is strongly skewed toward low values compared with the null distribution. Lastly, we plot these results.
caf_ctrl <- JackStraw(object = caf_ctrl, num.replicate = 100, verbose = FALSE)
caf_ctrl <- ScoreJackStraw(caf_ctrl, dims = 1:20)
JackStrawPlot(object = caf_ctrl, dims = 1:20)
Warning message: “Removed 28000 rows containing missing values (`geom_point()`).”
Alternatively, we can determine which PCs to include by looking at an 'elbow plot', which ranks the principal components by the percentage of variance each one explains. We look for an "elbow", the point after which additional PCs capture little further variation in the data (around PC 8 or 9 here). In practice, it can be difficult to determine the number of PCs to use in downstream analysis. However, if you run the later code with various numbers of PCs (8, 9, 30, ...), you will see that the results generally do not differ much.
ElbowPlot(object = caf_ctrl)
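The quantity behind an elbow plot can be computed directly from the standard deviations of the principal components. The sketch below uses base R's prcomp on simulated data with three underlying factors; the data and object names are made up, and the percentages are relative to the PCs that were computed.

```r
set.seed(7)
n_cells <- 150
n_genes <- 40

# Simulated expression: 3 latent factors plus noise (cells in rows, genes in columns)
latent   <- matrix(rnorm(n_cells * 3), ncol = 3)
loadings <- matrix(rnorm(3 * n_genes), nrow = 3)
expr <- latent %*% loadings +
  matrix(rnorm(n_cells * n_genes, sd = 0.5), ncol = n_genes)

pca <- prcomp(expr, center = TRUE, scale. = TRUE)

# Percentage of variance explained by each PC, and the running total
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(pct_var)

round(head(pct_var, 6), 1)
round(head(cum_var, 6), 1)
```

In this simulation the explained variance drops sharply after the third PC, which is the kind of elbow to look for. For a Seurat object, the per-PC standard deviations can be retrieved with Stdev(caf_ctrl, reduction = "pca") and used in the same way.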
Choosing the correct number of dimensions can be challenging. We only choose the first 10 PCs here, but we encourage you to repeat the clustering and downstream analysis with different numbers of PCs.
The next step in the analysis is to use unsupervised clustering to group similar cells into clusters. Seurat uses a graph-based clustering technique built upon (Macosko et al, 2015) as well as (Xu & Su, 2015) and (Levine et al, 2015). Cells are first embedded in a graph structure, by default a K-nearest neighbor (KNN) graph, with edges connecting cells that have similar gene expression patterns. The graph is then partitioned into highly interconnected communities. The graph is built by the FindNeighbors function, in this case from the first 10 PCs. Next, the FindClusters function groups similar cells together, using the Louvain algorithm by default; the resolution parameter sets the granularity of the clustering. It is recommended that you try several resolution settings. Higher resolution leads to more clusters, and in general you will either over-cluster or under-cluster your data to some degree. If you err on the side of over-clustering, you can decrease the resolution or merge similar clusters together afterwards, if need be. Larger datasets generally need a higher resolution to avoid under-clustering.
# cluster the cells
caf_ctrl <- FindNeighbors(caf_ctrl, reduction="pca", dims = 1:10, verbose = F, force.recalc = T)
caf_ctrl <- FindClusters(object = caf_ctrl, resolution = 0.4)
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck Number of nodes: 3199 Number of edges: 100595 Running Louvain algorithm... Maximum modularity in 10 random starts: 0.7737 Number of communities: 7 Elapsed time: 0 seconds
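The graph construction step can be illustrated on a toy example in base R. The sketch below finds the k nearest neighbours of each cell in a two-dimensional "PC space" and then weights each pair of cells by the Jaccard overlap of their neighbour sets, which is the general idea behind a shared nearest neighbour graph. The data, the choice k = 5, and all object names are assumptions made for illustration; FindNeighbors uses k.param = 20 by default and an optimized implementation.

```r
set.seed(3)

# Two well-separated toy groups of 10 cells each in a 2-D "PC space"
pcs <- rbind(matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2))
rownames(pcs) <- paste0("cell", seq_len(nrow(pcs)))

k <- 5
d <- as.matrix(dist(pcs))   # Euclidean distances between all pairs of cells

# Indices of the k nearest neighbours of each cell (each cell includes itself)
knn <- t(apply(d, 1, function(row) order(row)[seq_len(k)]))

# Jaccard overlap between the neighbour sets of every pair of cells
n <- nrow(pcs)
snn <- matrix(0, n, n, dimnames = list(rownames(pcs), rownames(pcs)))
for (a in seq_len(n)) {
  for (b in seq_len(n)) {
    shared <- length(intersect(knn[a, ], knn[b, ]))
    total  <- length(union(knn[a, ], knn[b, ]))
    snn[a, b] <- shared / total
  }
}

round(snn[1:3, 1:3], 2)          # weights within the first group
max(snn[1:10, 11:20])            # no shared neighbours between the groups
```

A community detection algorithm such as Louvain then looks for groups of cells that are densely connected in this weighted graph.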
Next, we will visualize the clusters we just identified using non-linear dimensional reduction techniques. Seurat offers several non-linear dimension reduction and visualization methods, such as tSNE and UMAP. Both aim to place cells with similar expression profiles close together in a low-dimensional space, and so provide an informative visualization of the heterogeneity in the data.
It is not advised to cluster on tSNE components, however it is a powerful visualization technique. As input to RunTSNE function, use the same dimensions you used as input into the clustering functions. The tSNE algorithm will place cells with similar neighborhoods in the graph embedding into similar locations in low dimensional space. This will allow you to visualize the high dimensional clustering that you did earlier in 2D. UMAP is another similar technique that you can likewise use, which is faster and some argue maintains the global structure of the data better in low dimensional space, though others argue that this mainly has to do with parameter setting chosen when running the reduction.
caf_ctrl <- RunTSNE(object = caf_ctrl, dims = 1:10, do.fast = TRUE)
DimPlot(object = caf_ctrl, reduction="tsne")
caf_ctrl <- RunUMAP(object = caf_ctrl, dims = 1:10,min.dist=0.01,spread=3)
DimPlot(object = caf_ctrl, reduction="umap")
Warning message: “The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation' This message will be shown once per session” 00:53:09 UMAP embedding parameters a = 0.3356 b = 0.7939 00:53:09 Read 3199 rows and found 10 numeric columns 00:53:09 Using Annoy for neighbor search, n_neighbors = 30 00:53:09 Building Annoy index with metric = cosine, n_trees = 50 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * | 00:53:10 Writing NN index file to temp file /tmp/RtmpP8q2j1/file24643c670c4278 00:53:10 Searching Annoy index using 1 thread, search_k = 3000 00:53:11 Annoy recall = 100% 00:53:11 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30 00:53:12 Initializing from normalized Laplacian + noise (using irlba) 00:53:12 Commencing optimization for 500 epochs, with 124700 positive edges 00:53:15 Optimization finished
Save the clustered data into a new RData file.
save(caf_ctrl, file="./data/caf_ctrl_clustered.RData")
There are many different methods to identify differentially expressed genes. Here, we will show you two different ways to identify cluster biomarkers (differentially expressed genes that differentiate the clusters). One is the marker-finding method provided by Seurat itself, and the other is edgeR (the same package used for bulk RNA-seq).
In the first method, we will use FindMarkers to run a nonparametric Wilcoxon test to identify biomarkers. The min.pct argument requires a gene to be detected in at least a minimum fraction of cells in either of the two groups being compared. Below, the ident.1 argument gives the ID of the cluster of interest; because no second group is given, cluster X is compared with all other clusters combined. This identifies genes that differ between cluster X and the rest of the cells. In the code below, many of the lines are commented out, because each function call takes a few minutes to finish. You can just run one of them as an example.
Below we start by finding genes that are differentially expressed between cluster 0 and all other clusters. If you only want to identify genes that are upregulated (high) in cluster 0 compared to other clusters, you can add the argument "only.pos = TRUE", however here we want to identify both genes that are high and low in cluster 0 compared to the other clusters.
You can compare other clusters using similar code. The code for the other clusters is commented out below to save running time.
library(Seurat)
load("./data/caf_ctrl_clustered.RData")
# Find Markers, each will take a few minutes to finish, so we only run first one as an example
# Cluster 0
cluster0.markers <- FindMarkers(object = caf_ctrl, ident.1 = 0, min.pct = 0.1)
print(x = head(x = cluster0.markers, n = 5))
# Cluster 1
# cluster1.markers <- FindMarkers(object = caf_ctrl, ident.1 = 1, min.pct = 0.1)
# print(x = head(x = cluster1.markers, n = 5))
# Cluster 2
# cluster2.markers <- FindMarkers(object = caf_ctrl, ident.1 = 2, min.pct = 0.1)
# print(x = head(x = cluster2.markers, n = 5))
# Cluster 3
# cluster3.markers <- FindMarkers(object = caf_ctrl, ident.1 = 3, min.pct = 0.1)
# print(x = head(x = cluster3.markers, n = 5))
# Cluster 4
# cluster4.markers <- FindMarkers(object = caf_ctrl, ident.1 = 4, min.pct = 0.1)
# print(x = head(x = cluster4.markers, n = 5))
# Cluster 5
# cluster5.markers <- FindMarkers(object = caf_ctrl, ident.1 = 5, min.pct = 0.1)
# print(x = head(x = cluster5.markers, n = 5))
| | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj |
|---|---|---|---|---|---|
| CALM2 | 9.565361e-58 | 0.2933801 | 1.000 | 1.000 | 3.222953e-53 |
| NNMT | 1.518187e-56 | 0.4385994 | 0.998 | 0.992 | 5.115379e-52 |
| TIMP2 | 1.037616e-55 | -0.5730115 | 0.989 | 0.990 | 3.496144e-51 |
| GLRX | 1.584119e-55 | 0.5557841 | 0.993 | 0.984 | 5.337529e-51 |
| NQO1 | 3.415059e-51 | 0.4186482 | 1.000 | 0.988 | 1.150670e-46 |
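The p_val_adj column is a multiple-testing correction of p_val. As a quick check, the sketch below multiplies the five p-values printed above by the number of features in the object (33694, from the object summary earlier) and compares the result with the reported adjusted values. This is consistent with a Bonferroni correction over all features in the dataset; the column names of the small data frame are our own.

```r
n_features <- 33694   # features in the RNA assay, from the object summary above

# p_val and p_val_adj as printed in the FindMarkers output above
markers <- data.frame(
  gene  = c("CALM2", "NNMT", "TIMP2", "GLRX", "NQO1"),
  p_val = c(9.565361e-58, 1.518187e-56, 1.037616e-55,
            1.584119e-55, 3.415059e-51),
  p_val_adj_reported = c(3.222953e-53, 5.115379e-52, 3.496144e-51,
                         5.337529e-51, 1.150670e-46)
)

# Bonferroni: multiply by the number of tests and cap at 1
markers$p_val_adj_recomputed <- pmin(1, markers$p_val * n_features)

# Ratio close to 1 means the two columns agree up to rounding
markers$ratio <- markers$p_val_adj_recomputed / markers$p_val_adj_reported
markers
```

Because the correction is this conservative, genes that remain significant after adjustment are strong marker candidates, although the effect size (avg_log2FC) and the detection rates (pct.1, pct.2) should be considered as well.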
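In the code below, we use the FindAllMarkers function to automate this process for all clusters. This compares cluster 0 with all other clusters, cluster 1 with all other clusters, and so on. Note that we have specified min.pct = 0.25, so a gene must be detected in at least 25% of the cells in either group to be tested, and logfc.threshold = 0.25, so the average difference between the two groups must be at least 0.25. Because we set slot = 'scale.data', the test runs on the scaled data and the effect size is reported as avg_diff rather than avg_log2FC. Then we select the top 15 markers for each cluster based on avg_diff. Note that FindAllMarkers will take a long time to finish.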
# Find all Markers, this code will take a long time to complete
caf_ctrl_markers <- FindAllMarkers(object = caf_ctrl, only.pos = FALSE, min.pct = 0.25,
logfc.threshold = 0.25, slot='scale.data')
head(caf_ctrl_markers)
Calculating cluster 0 Calculating cluster 1 Calculating cluster 2 Calculating cluster 3 Calculating cluster 4 Calculating cluster 5 Calculating cluster 6
| | p_val | avg_diff | pct.1 | pct.2 | p_val_adj | cluster | gene |
|---|---|---|---|---|---|---|---|
| | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <fct> | <chr> |
| GLRX | 8.137034e-62 | 0.5445789 | 0.760 | 0.449 | 1.627407e-58 | 0 | GLRX |
| GCLM | 1.063910e-59 | 0.5419747 | 0.792 | 0.513 | 2.127820e-56 | 0 | GCLM |
| CALM2 | 6.267483e-58 | 0.5732416 | 0.741 | 0.429 | 1.253497e-54 | 0 | CALM2 |
| SBF2-AS1 | 5.351493e-53 | 0.5145232 | 0.810 | 0.557 | 1.070299e-49 | 0 | SBF2-AS1 |
| NNMT | 6.714445e-53 | 0.5136216 | 0.762 | 0.464 | 1.342889e-49 | 0 | NNMT |
| NQO1 | 7.784938e-51 | 0.5168075 | 0.752 | 0.449 | 1.556988e-47 | 0 | NQO1 |
library(dplyr)
top15_logFC <- caf_ctrl_markers %>% group_by(cluster) %>% top_n(15, avg_diff)
top15_logFC
| p_val | avg_diff | pct.1 | pct.2 | p_val_adj | cluster | gene |
|---|---|---|---|---|---|---|
| <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <fct> | <chr> |
| 8.137034e-62 | 0.5445789 | 0.760 | 0.449 | 1.627407e-58 | 0 | GLRX |
| 1.063910e-59 | 0.5419747 | 0.792 | 0.513 | 2.127820e-56 | 0 | GCLM |
| 6.267483e-58 | 0.5732416 | 0.741 | 0.429 | 1.253497e-54 | 0 | CALM2 |
| 5.351493e-53 | 0.5145232 | 0.810 | 0.557 | 1.070299e-49 | 0 | SBF2-AS1 |
| 6.714445e-53 | 0.5136216 | 0.762 | 0.464 | 1.342889e-49 | 0 | NNMT |
| 7.784938e-51 | 0.5168075 | 0.752 | 0.449 | 1.556988e-47 | 0 | NQO1 |
| 2.067844e-49 | 0.4926607 | 0.790 | 0.537 | 4.135688e-46 | 0 | TNFRSF12A |
| 3.482508e-48 | 0.5056353 | 0.784 | 0.565 | 6.965015e-45 | 0 | STC2 |
| 1.026793e-46 | 0.5005518 | 0.724 | 0.422 | 2.053586e-43 | 0 | PRDX1 |
| 2.158916e-44 | 0.4865853 | 0.734 | 0.477 | 4.317832e-41 | 0 | TXNRD1 |
| 3.032248e-42 | 0.5090307 | 0.714 | 0.466 | 6.064496e-39 | 0 | PFN1 |
| 9.318796e-41 | 0.4768871 | 0.685 | 0.438 | 1.863759e-37 | 0 | TPM1 |
| 1.072448e-40 | 0.4723494 | 0.790 | 0.538 | 2.144897e-37 | 0 | TNFRSF11B |
| 4.503110e-40 | 0.4728337 | 0.726 | 0.458 | 9.006220e-37 | 0 | CFL1 |
| 8.663760e-40 | 0.4687297 | 0.797 | 0.571 | 1.732752e-36 | 0 | IGFBP3 |
| 6.320753e-143 | 0.8086531 | 0.837 | 0.393 | 1.264151e-139 | 1 | OST4 |
| 6.774061e-104 | 0.6259537 | 0.845 | 0.471 | 1.354812e-100 | 1 | CAMK2N1 |
| 2.572495e-94 | 0.7583933 | 0.773 | 0.406 | 5.144990e-91 | 1 | MT-ND4 |
| 4.137869e-86 | 0.6795128 | 0.782 | 0.467 | 8.275739e-83 | 1 | FSTL1 |
| 4.404964e-85 | 0.7392704 | 0.747 | 0.429 | 8.809928e-82 | 1 | TMSB10 |
| 2.026055e-77 | 0.5616456 | 0.823 | 0.523 | 4.052111e-74 | 1 | IGF2 |
| 3.007625e-71 | 0.5847735 | 0.785 | 0.471 | 6.015250e-68 | 1 | TIMP2 |
| 2.074076e-60 | 0.5727921 | 0.792 | 0.531 | 4.148153e-57 | 1 | NABP1 |
| 6.887300e-56 | 0.6209232 | 0.682 | 0.441 | 1.377460e-52 | 1 | SELM |
| 2.792222e-55 | 0.5530528 | 0.710 | 0.446 | 5.584444e-52 | 1 | POLR2L |
| 1.101790e-54 | 0.6073151 | 0.693 | 0.419 | 2.203580e-51 | 1 | TMSB4X |
| 7.168131e-47 | 0.6201773 | 0.680 | 0.463 | 1.433626e-43 | 1 | FTL |
| 9.000951e-45 | 0.5508612 | 0.819 | 0.609 | 1.800190e-41 | 1 | MRVI1 |
| 8.235448e-43 | 0.5424776 | 0.809 | 0.588 | 1.647090e-39 | 1 | USP53 |
| 1.996406e-30 | 0.5337986 | 0.582 | 0.319 | 3.992812e-27 | 1 | SCUBE3 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 2.652816e-36 | 4.544904 | 1.000 | 0.299 | 5.305632e-33 | 5 | MKI67 |
| 3.632837e-34 | 4.565972 | 0.982 | 0.303 | 7.265674e-31 | 5 | CEP55 |
| 5.762209e-34 | 4.282521 | 0.982 | 0.319 | 1.152442e-30 | 5 | GTSE1 |
| 6.966664e-28 | 4.572562 | 0.929 | 0.306 | 1.393333e-24 | 5 | DEPDC1 |
| 4.136647e-27 | 4.535806 | 0.929 | 0.370 | 8.273294e-24 | 5 | KIF2C |
| 7.251863e-24 | 4.939070 | 0.893 | 0.338 | 1.450373e-20 | 5 | SPC25 |
| 1.163520e-22 | 4.535190 | 0.875 | 0.252 | 2.327039e-19 | 5 | CDCA8 |
| 2.392881e-20 | 4.379211 | 0.857 | 0.273 | 4.785761e-17 | 5 | KIFC1 |
| 6.460913e-19 | 4.278136 | 0.839 | 0.318 | 1.292183e-15 | 5 | NUF2 |
| 1.353696e-18 | 4.353691 | 0.839 | 0.290 | 2.707392e-15 | 5 | SGOL1 |
| 7.217916e-15 | 5.339904 | 0.804 | 0.266 | 1.443583e-11 | 5 | ESCO2 |
| 2.771655e-13 | 4.302038 | 0.786 | 0.164 | 5.543310e-10 | 5 | PKMYT1 |
| 2.998486e-12 | 4.369138 | 0.768 | 0.169 | 5.996972e-09 | 5 | ASF1B |
| 1.713129e-05 | 4.548687 | 0.607 | 0.025 | 3.426258e-02 | 5 | DTL |
| 3.868072e-03 | 4.385763 | 0.571 | 0.089 | 1.000000e+00 | 5 | MCM10 |
| 4.729120e-24 | 2.493509 | 1.000 | 0.525 | 9.458240e-21 | 6 | S100A11 |
| 1.251817e-23 | 2.440120 | 1.000 | 0.505 | 2.503634e-20 | 6 | TMSB10 |
| 2.780704e-21 | 2.084762 | 1.000 | 0.508 | 5.561408e-18 | 6 | POLR2L |
| 6.562813e-21 | 1.564547 | 1.000 | 0.575 | 1.312563e-17 | 6 | COTL1 |
| 7.172092e-21 | 1.598519 | 0.974 | 0.537 | 1.434418e-17 | 6 | SEC61G |
| 3.712187e-20 | 2.030556 | 1.000 | 0.483 | 7.424375e-17 | 6 | TMSB4X |
| 6.169797e-20 | 1.715076 | 0.974 | 0.529 | 1.233959e-16 | 6 | SH3BGRL3 |
| 2.445759e-18 | 2.235110 | 0.921 | 0.450 | 4.891518e-15 | 6 | MT2A |
| 2.484284e-18 | 1.844152 | 0.947 | 0.526 | 4.968567e-15 | 6 | PFN1 |
| 3.866244e-17 | 1.622470 | 0.974 | 0.526 | 7.732487e-14 | 6 | C12orf75 |
| 9.787282e-17 | 1.898866 | 0.921 | 0.481 | 1.957456e-13 | 6 | TPM2 |
| 4.953742e-16 | 1.665525 | 0.947 | 0.465 | 9.907485e-13 | 6 | ANXA2 |
| 7.889069e-16 | 1.602573 | 0.921 | 0.497 | 1.577814e-12 | 6 | MYL12A |
| 3.524484e-07 | 1.639189 | 0.553 | 0.215 | 7.048967e-04 | 6 | BIRC5 |
| 2.001293e-03 | 1.548910 | 0.421 | 0.207 | 1.000000e+00 | 6 | TRIP13 |
In this example, the DoHeatmap function generates an expression heatmap for the given cells and features. Here, we plot the top 15 markers per cluster from top15_logFC and save the plot as Cluster_heatmap_ctrl.png in the Figures folder.
Then we write the full marker table, caf_ctrl_markers, to a CSV file, so that we do not need to rerun FindAllMarkers later. The clustered Seurat object itself was already saved to caf_ctrl_clustered.RData above, and can be reloaded at any time with the load() function in R.
# generate heatmap
heatmap_ctrl <- DoHeatmap(object = caf_ctrl, features = top15_logFC$gene)+ NoLegend()
heatmap_ctrl
#save plot
png(file = "./Figures/Cluster_heatmap_ctrl.png", width = 1024, height = 768)
print(heatmap_ctrl)
dev.off()
#save marker results
write.csv(caf_ctrl_markers, file = "./data/cluster_markers_ctrl.csv")
The code below shows how to access data in various ways, for example by saving the names of the cells that are in clusters 0, 3, and 1. Next, you save a list of some of your favorite genes in the variable gene_list, and use it to export the expression values of these genes to CSV files for clusters 0, 3, and 1.
# Find cells in clusters 0, 3 and 1
clstr_0 <- names(caf_ctrl@active.ident[caf_ctrl@active.ident == 0])
clstr_3 <- names(caf_ctrl@active.ident[caf_ctrl@active.ident == 3])
clstr_1 <- names(caf_ctrl@active.ident[caf_ctrl@active.ident == 1])
# Genes of interest
gene_list <- c("EBP", "FDFT1", "LSS", "MSMO1", "SQLE", "DHCR7", "DHCR24", "TM7SF2")
# Log-normalized values are in the "data" slot, which holds every gene;
# "scale.data" only holds the genes passed to ScaleData (the variable features by default)
exp_mat_cls_0 <- caf_ctrl[["RNA"]]@data[gene_list, clstr_0]
exp_mat_cls_3 <- caf_ctrl[["RNA"]]@data[gene_list, clstr_3]
exp_mat_cls_1 <- caf_ctrl[["RNA"]]@data[gene_list, clstr_1]
# Convert the sparse matrices to dense ones before writing them out
write.csv(as.matrix(exp_mat_cls_0), file = "./data/cluster0_norm_expr.csv")
write.csv(as.matrix(exp_mat_cls_3), file = "./data/cluster3_norm_expr.csv")
write.csv(as.matrix(exp_mat_cls_1), file = "./data/cluster1_norm_expr.csv")
Similar to bulk RNA-seq, we can perform DE analysis using the edgeR package. Although edgeR was developed for bulk RNA-seq data, it also works very well in practice for single-cell data, provided the dataset is not so large that memory becomes a problem. Here, we simply want to give you a bit more practice with edgeR, and you can compare the results with what we got from Seurat. We take only clusters 0 and 3 as an example here.
The following code takes about 5 minutes to run.
library(edgeR)
load("./data/caf_ctrl_clustered.RData")
# we only subset 2 clusters here
# Find cells in cluster 0 and 3
clstr_0 <- names(caf_ctrl@active.ident[caf_ctrl@active.ident == 0])
clstr_3 <- names(caf_ctrl@active.ident[caf_ctrl@active.ident == 3])
caf_sub <- caf_ctrl[, c(clstr_0, clstr_3)]
counts <- caf_sub[["RNA"]]@counts
group <- caf_sub@meta.data$seurat_clusters
# build the DGEList object
dge <- DGEList(counts = counts,
norm.factors = rep(1, ncol(counts)),
group = group)
group_edgeR <- factor(group)
design <- model.matrix(~ group_edgeR)
dge <- estimateGLMCommonDisp(dge, design = design)
fit <- glmFit(dge, design)
res <- glmLRT(fit)
# extract the likelihood-ratio test p-values by column name
pVals <- res$table$PValue
names(pVals) <- rownames(res$table)
pVals <- p.adjust(pVals, method = "fdr")
head(as.data.frame(sort(pVals)),n=30)
| | sort(pVals) |
|---|---|
| | <dbl> |
| PTGDS | 0.000000e+00 |
| HSP90AA1 | 0.000000e+00 |
| UBC | 8.113929e-290 |
| MGP | 8.998845e-284 |
| ACTA2 | 1.457681e-268 |
| APOE | 2.315670e-253 |
| CTSK | 5.226153e-248 |
| ITM2B | 1.105687e-237 |
| FDPS | 8.370284e-209 |
| CD63 | 3.255205e-202 |
| C1R | 2.398328e-188 |
| SEPT7 | 1.637588e-184 |
| PCOLCE | 3.881557e-180 |
| ACTG2 | 5.150734e-177 |
| TUBA1A | 3.995328e-174 |
| C1S | 3.359145e-172 |
| CD9 | 3.431531e-172 |
| HSP90AB1 | 6.449721e-171 |
| HLA-A | 5.632220e-169 |
| LAPTM4A | 6.260576e-164 |
| FABP3 | 1.897097e-160 |
| TMEM176B | 1.205865e-159 |
| TFPI2 | 3.239466e-156 |
| CLU | 2.488584e-155 |
| FBLN1 | 4.415904e-151 |
| LGALS3BP | 3.465907e-147 |
| GPNMB | 9.587354e-144 |
| AKAP12 | 4.262962e-134 |
| CTSL | 7.907964e-134 |
| SERPINF1 | 1.720762e-130 |
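The p.adjust(pVals, method = "fdr") call above applies the Benjamini-Hochberg procedure. As a rough illustration of what that adjustment does, the sketch below reproduces it by hand on five made-up p-values (the gene names and values are hypothetical) and compares the result with the built-in function.

```r
# Made-up raw p-values (hypothetical, for illustration only)
p <- c(geneA = 0.001, geneB = 0.04, geneC = 0.03, geneD = 0.20, geneE = 0.008)

# Benjamini-Hochberg by hand:
# 1. order p-values from largest to smallest
# 2. scale each by n / rank (rank n for the largest, 1 for the smallest)
# 3. enforce monotonicity with a running minimum and cap at 1
n <- length(p)
o <- order(p, decreasing = TRUE)
bh_sorted <- pmin(cummin(n / (n:1) * p[o]), 1)
bh_manual <- bh_sorted[order(o)]   # back to the original gene order
names(bh_manual) <- names(p)

# Built-in equivalent ("fdr" is an alias for "BH")
bh_builtin <- p.adjust(p, method = "fdr")

print(round(cbind(raw = p, manual = bh_manual, builtin = bh_builtin), 4))
```

Because the adjustment depends on the number of tests, filtering genes before testing changes the adjusted values.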
Often we have not one single-cell dataset to analyze but several. While it can be useful to analyze each one separately, it is frequently necessary to integrate the datasets as well.
The Seurat package provides a method to integrate multiple datasets, even when they were collected from different individuals or under different environmental conditions. Datasets are aligned using 'anchors': pairs of cells from different datasets that are identified as being in a matched biological state. Details of the integration method can be found in Stuart et al., 2019.
Next we will use what we have already learned and analyze three datasets together, eventually clustering them together. The data here are the control (untreated) CAF cells we worked with earlier, DHT (dihydrotestosterone) treated CAFs and E2 (estradiol) treated CAFs.
As we did for the "Control" group, we are going to read in the data and construct a Seurat object for each group.
Each of the code cells below takes about 5 minutes to run.
library(Seurat)
# ******************** Merge data from all experiments using Seurat ********************** #
data_path="/anvil/projects/x-tra220018/current/datasets/single_cellData/Ratliff_CAF/results"
caf_ctrl_data <- Read10X(data.dir = paste0(data_path, "/Control-CAF/outs/filtered_feature_bc_matrix"))
caf_ctrl <- CreateSeuratObject(counts = caf_ctrl_data, project = "CAFCTRL")
caf_dht_data <- Read10X(data.dir = paste0(data_path, "/DHT-CAF/outs/filtered_feature_bc_matrix"))
caf_dht <- CreateSeuratObject(counts = caf_dht_data, project = "CAFDHT")
caf_e2_data <- Read10X(data.dir = paste0(data_path, "/E2-CAF/outs/filtered_feature_bc_matrix"))
caf_e2 <- CreateSeuratObject(counts = caf_e2_data, project = "CAFE2")
Warning message:
“Feature names cannot have underscores ('_'), replacing with dashes ('-')”
Warning message:
“Feature names cannot have underscores ('_'), replacing with dashes ('-')”
Warning message:
“Feature names cannot have underscores ('_'), replacing with dashes ('-')”
# Combine data
caf_combine <- merge(x = caf_ctrl, y = c(caf_dht, caf_e2), merge.data = TRUE,
                     add.cell.ids = c("ctrl", "dht", "e2"), project = "CAF1")
#get count matrix
caf_exprs_mat <- caf_combine[["RNA"]]@counts
# save data
save(caf_combine, file = "./data/caf_combine.RData")
save(caf_exprs_mat, file = "./data/caf_exprs_mat.RData")
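The add.cell.ids argument matters because cell barcodes from separate 10x runs can coincide, and merged cell names must be unique. A small base R sketch of the idea, assuming the prefix is joined to the barcode with an underscore (the barcodes below are made up):

```r
# Hypothetical barcodes; the same barcode can occur in two separate runs
ctrl_cells <- c("AAACCTGAGAAACCAT-1", "AAACCTGAGTGTTGAA-1")
dht_cells  <- c("AAACCTGAGAAACCAT-1", "AAACGGGCATTCACTT-1")

# Without prefixes the combined cell names are not unique
print(any(duplicated(c(ctrl_cells, dht_cells))))

# Prefix each barcode with its sample label, joined by "_"
prefixed <- c(paste("ctrl", ctrl_cells, sep = "_"),
              paste("dht",  dht_cells,  sep = "_"))
print(prefixed)
print(any(duplicated(prefixed)))
```

After the merge, the prefix also gives a quick way to tell which sample a cell came from when looking at raw cell names.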
Here, we load our combined object and split it into a list of Seurat objects, one per original dataset (condition). This lets us apply the same processing function to every dataset with lapply, cutting down on the lines of code we write. We will address these goals for integration analysis:
library(cowplot)
load("./data/caf_combine.RData")
caf.list <- SplitObject(caf_combine, split.by = "orig.ident")
caf.list
$CAFCTRL
An object of class Seurat
33694 features across 3321 samples within 1 assay
Active assay: RNA (33694 features, 0 variable features)

$CAFDHT
An object of class Seurat
33694 features across 4166 samples within 1 assay
Active assay: RNA (33694 features, 0 variable features)

$CAFE2
An object of class Seurat
33694 features across 4932 samples within 1 assay
Active assay: RNA (33694 features, 0 variable features)
We will process the data much as before, using the functions NormalizeData and FindVariableFeatures, this time inside a function that is applied to all three datasets. Scaling is not run here because FindIntegrationAnchors scales the selected features itself.
# -------- standard pre-processing and identify features ----------- #
caf.list <- lapply(X = caf.list, FUN = function(x) {
    x <- NormalizeData(x)
    x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 2000)
    x  # return the processed object
})
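For intuition about the normalization step, the sketch below reimplements log-normalization in base R on a toy count matrix. It assumes the NormalizeData defaults (the "LogNormalize" method with a scale factor of 10,000); the counts are made up.

```r
# Toy count matrix: 3 genes x 2 cells (made-up values)
counts <- matrix(c(10,  0,
                    5, 20,
                   85, 30),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("g1", "g2", "g3"), c("cellA", "cellB")))

scale_factor <- 1e4            # assumed default scale.factor
lib_size <- colSums(counts)    # total counts per cell

# Divide each column by its library size, rescale, then take log(1 + x)
norm <- log1p(sweep(counts, 2, lib_size, "/") * scale_factor)
print(round(norm, 3))
```

Dividing by the per-cell total removes differences in sequencing depth between cells, and the log transform compresses the range of highly expressed genes.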
Now we will perform an integrated analysis, using the methods described in Stuart et al., 2019 to integrate the three datasets and then cluster all of the cells together. When clustering multiple datasets together, we want a method that preserves the individual features of each dataset, while correcting for batch effects and allowing identification of shared features. The present method is a marked improvement over previous methods, which tended to overcorrect for differences between datasets.
The FindIntegrationAnchors function identifies a set of anchors that can be used later to co-cluster the datasets.
The following code will take several minutes to complete. It will also print multiple messages as it runs.
# ---------- Integrate datasets with identified anchors -------- #
caf.anchors <- FindIntegrationAnchors(object.list = caf.list, dims = 1:20)
Computing 2000 integration features
Scaling features for provided objects
Finding all pairwise anchors
Running CCA
Merging objects
Finding neighborhoods
Finding anchors
	Found 11176 anchors
Filtering anchors
	Retained 7559 anchors
Running CCA
Merging objects
Finding neighborhoods
Finding anchors
	Found 11802 anchors
Filtering anchors
	Retained 7660 anchors
Running CCA
Merging objects
Finding neighborhoods
Finding anchors
	Found 13383 anchors
Filtering anchors
	Retained 9398 anchors
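To give a feel for what an anchor is, the sketch below finds mutual nearest neighbors between two toy sets of cells in a shared 2-D space. This is a deliberately simplified stand-in, not Seurat's implementation: FindIntegrationAnchors works in a CCA-derived space, considers several neighbors per cell, and then scores and filters the candidate anchors. All coordinates and cell names are made up.

```r
# Two toy "datasets" of cells in a shared 2-D space (coordinates are made up)
a <- rbind(a1 = c(0, 0), a2 = c(5, 5), a3 = c(10, 0))
b <- rbind(b1 = c(0.5, 0.2), b2 = c(5.2, 4.9), b3 = c(4.0, 4.0))

# Euclidean distance from every cell in a (rows) to every cell in b (columns)
d <- as.matrix(dist(rbind(a, b)))[rownames(a), rownames(b)]

nn_a_to_b <- apply(d, 1, which.min)   # nearest b cell for each a cell
nn_b_to_a <- apply(d, 2, which.min)   # nearest a cell for each b cell

# Keep a pair only if each cell is the other's nearest neighbor
is_mutual <- nn_b_to_a[nn_a_to_b] == seq_len(nrow(a))
anchors <- data.frame(cell_a = rownames(a)[is_mutual],
                      cell_b = rownames(b)[nn_a_to_b[is_mutual]],
                      stringsAsFactors = FALSE)
print(anchors)
```

Requiring the relationship to hold in both directions is what keeps a cell with no real counterpart in the other dataset (a3 here) from being forced into a pair.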
We now run the function IntegrateData in Seurat, which combines the data, using the precomputed anchor set.
caf.integrate <- IntegrateData(anchorset = caf.anchors, dims = 1:20)
Merging dataset 1 into 3
Extracting anchors for merged samples
Finding integration vectors
Finding integration vector weights
Integrating data
Merging dataset 2 into 3 1
Extracting anchors for merged samples
Finding integration vectors
Finding integration vector weights
Integrating data
Now we will run the standard workflow on the integrated dataset: scaling the data, running PCA, finding neighbors, finding clusters, and performing dimension reduction and visualization. The analysis and visualization methods are the same as for a single dataset, except that the analysis now covers all cells in the much larger integrated dataset. This time we'll also use UMAP instead of tSNE for dimension reduction.
Each of the code cells below will take several minutes to run.
# ------------ Run a single integrated analysis on all cells ------------ #
DefaultAssay(caf.integrate) <- "integrated"
# Run the standard workflow for visualization and clustering
caf.integrate <- ScaleData(caf.integrate, verbose = FALSE)
caf.integrate <- RunPCA(caf.integrate, npcs = 30, verbose = FALSE)
# UMAP and Clustering
caf.integrate <- RunUMAP(caf.integrate, reduction = "pca", dims = 1:20)
caf.integrate <- FindNeighbors(caf.integrate, reduction = "pca", dims = 1:20)
caf.integrate <- FindClusters(caf.integrate, resolution = 0.5)
save(caf.integrate, file = "./data/caf.integrate.RData")
00:59:15 UMAP embedding parameters a = 0.9922 b = 1.112
00:59:15 Read 12419 rows and found 20 numeric columns
00:59:15 Using Annoy for neighbor search, n_neighbors = 30
00:59:15 Building Annoy index with metric = cosine, n_trees = 50
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
00:59:17 Writing NN index file to temp file /tmp/RtmpP8q2j1/file24643c1637eb83
00:59:17 Searching Annoy index using 1 thread, search_k = 3000
00:59:20 Annoy recall = 100%
00:59:21 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
00:59:22 Initializing from normalized Laplacian + noise (using irlba)
00:59:22 Commencing optimization for 200 epochs, with 516642 positive edges
00:59:27 Optimization finished
Computing nearest neighbor graph
Computing SNN
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 12419
Number of edges: 419488
Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.8443
Number of communities: 12
Elapsed time: 1 seconds
# Visualization
DimPlot(caf.integrate, reduction = "umap", label = TRUE)
By adding the argument group.by or split.by, we can also color the cells by group or draw a separate panel for each group.
DimPlot(caf.integrate, reduction = "umap", group.by = "orig.ident")
DimPlot(caf.integrate, reduction = "umap", split.by = "orig.ident", ncol=3)
The FindConservedMarkers function performs differential expression analysis within each group and then combines the p-values using a meta-analysis method from the metap package (the minimump_p_val column in the output below comes from its minimump function).
In this example, we'll use cluster 0 to find conserved markers, which could be target markers that differentiate cell type 0 from all other cells. The cluster ID is specified with the ident.1 argument.
library(MetaDE)
library(metap)
# eg. in cluster 0
load("data/caf.integrate.RData")
DefaultAssay(caf.integrate) <- "RNA"
cluster0.markers <- FindConservedMarkers(caf.integrate, ident.1 = 0, grouping.var = "orig.ident", verbose = TRUE)
head(cluster0.markers)
Testing group CAFCTRL: (0) vs (4, 8, 6, 2, 7, 3, 10, 5, 9, 1, 11)
Testing group CAFDHT: (0) vs (9, 1, 7, 8, 5, 2, 6, 4, 11, 3, 10)
Testing group CAFE2: (0) vs (1, 9, 3, 6, 5, 2, 8, 7, 4, 10, 11)
| | CAFCTRL_p_val | CAFCTRL_avg_log2FC | CAFCTRL_pct.1 | CAFCTRL_pct.2 | CAFCTRL_p_val_adj | CAFDHT_p_val | CAFDHT_avg_log2FC | CAFDHT_pct.1 | CAFDHT_pct.2 | CAFDHT_p_val_adj | CAFE2_p_val | CAFE2_avg_log2FC | CAFE2_pct.1 | CAFE2_pct.2 | CAFE2_p_val_adj | max_pval | minimump_p_val |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> |
| NQO1 | 9.361404e-57 | 0.4115954 | 0.999 | 0.988 | 3.154232e-52 | 2.573931e-67 | 0.4215092 | 0.998 | 0.960 | 8.672602e-63 | 3.375216e-103 | 0.5730114 | 0.997 | 0.972 | 1.137245e-98 | 9.361404e-57 | 1.012565e-102 |
| PRDX1 | 2.160477e-58 | 0.3049560 | 1.000 | 1.000 | 7.279511e-54 | 4.667434e-60 | 0.2868176 | 1.000 | 0.998 | 1.572645e-55 | 2.652461e-100 | 0.3930062 | 1.000 | 0.999 | 8.937201e-96 | 2.160477e-58 | 7.957382e-100 |
| NNMT | 1.076977e-67 | 0.4227141 | 1.000 | 0.989 | 3.628767e-63 | 3.189462e-66 | 0.3712696 | 1.000 | 0.970 | 1.074657e-61 | 2.509703e-99 | 0.4596493 | 1.000 | 0.977 | 8.456192e-95 | 3.189462e-66 | 7.529108e-99 |
| GAPDH | 1.439722e-58 | 0.2846855 | 1.000 | 1.000 | 4.851000e-54 | 6.928142e-78 | 0.2838060 | 1.000 | 1.000 | 2.334368e-73 | 9.438581e-95 | 0.2952283 | 1.000 | 1.000 | 3.180235e-90 | 1.439722e-58 | 2.831574e-94 |
| GLRX | 7.643490e-64 | 0.5499935 | 0.999 | 0.979 | 2.575397e-59 | 1.014001e-71 | 0.5312733 | 0.998 | 0.961 | 3.416574e-67 | 4.764570e-85 | 0.5220748 | 0.999 | 0.965 | 1.605374e-80 | 7.643490e-64 | 1.429371e-84 |
| PTGR1 | 1.687555e-37 | 0.3379690 | 0.996 | 0.967 | 5.686047e-33 | 2.197870e-44 | 0.3264536 | 0.987 | 0.922 | 7.405502e-40 | 1.357667e-81 | 0.5171790 | 0.985 | 0.933 | 4.574523e-77 | 1.687555e-37 | 4.073001e-81 |
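The last two columns of the table can be reproduced by hand, which helps when deciding which one to filter on. The sketch below assumes the p-values were combined with the minimum-p (Tippett) method, as the minimump_p_val column name suggests, and uses the three per-condition p-values for NQO1 copied from the table.

```r
# Per-condition p-values for NQO1, copied from the table above
p_nqo1 <- c(CAFCTRL = 9.361404e-57, CAFDHT = 2.573931e-67, CAFE2 = 3.375216e-103)

# max_pval: the least significant of the per-condition p-values
max_pval <- max(p_nqo1)

# Minimum-p combination: 1 - (1 - min(p))^k,
# written with log1p/expm1 so that very small p-values do not round to zero
k <- length(p_nqo1)
minimump <- -expm1(k * log1p(-min(p_nqo1)))

print(c(max_pval = max_pval, minimump_p_val = minimump))
```

The two summaries answer different questions: max_pval is small only when the gene is a marker in every condition, while minimump_p_val is small as soon as the gene is a strong marker in at least one condition.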
We can also explore the distribution of marker genes that differentiate each cluster. By repeating the code above with ident.1 set to each value from 0 to 11, we can get a list of markers for every cluster. Below, we visualize only the most significant marker gene from each of the first nine clusters. You can also visualize any gene of interest.
# explore marker genes for each cluster (first 9 clusters)
FeaturePlot(caf.integrate, features = c("MALAT1","S100A4","ANXA2","RPS15","YBX3","PHKG1","MFAP4","BIRC5","FABP3"),
min.cutoff = "q9")
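The min.cutoff = "q9" argument in the FeaturePlot call above sets the lower end of the color scale at a quantile of the expression values rather than at a fixed number. A base R sketch of the idea, assuming "q9" is read as the 9th percentile and that values below the cutoff are raised to it (the expression values are made up):

```r
# Made-up expression values for one gene across ten cells
expr <- c(0.0, 0.2, 0.3, 0.4, 0.9, 1.3, 1.8, 2.2, 2.9, 3.5)

# "q9" is interpreted here as the 9th percentile of the plotted values
cutoff <- as.numeric(quantile(expr, probs = 9 / 100))

# Values below the cutoff are raised to it, so they share the lowest color
clipped <- pmax(expr, cutoff)
print(cutoff)
print(clipped)
```

Clipping the low end this way keeps a few near-zero cells from stretching the color scale and washing out the contrast among expressing cells.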
We can also explore the genes that are differentially expressed between the control, DHT, and E2 conditions for cells of the same type. First, we will create a new metadata item in caf.integrate that contains both cell type and condition information. The code below combines the cell type cluster ID with the condition name and sets the result as the new ident.
load("data/caf.integrate.RData")
caf.integrate$celltype.caf <- paste(Idents(caf.integrate), caf.integrate$orig.ident, sep = "_")
caf.integrate$celltype <- Idents(caf.integrate)
Idents(caf.integrate) <- "celltype.caf"
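The combined labels are plain strings of the form cluster_condition. A toy sketch of the same construction in base R, followed by a count of cells per label, which is worth checking before comparing two groups (the cluster IDs and conditions below are made up):

```r
# Toy metadata for six cells (cluster IDs and conditions are made up)
cluster   <- c("0", "0", "6", "6", "6", "1")
condition <- c("CAFCTRL", "CAFE2", "CAFCTRL", "CAFE2", "CAFE2", "CAFDHT")

# Same construction as celltype.caf above: "<cluster>_<condition>"
celltype_caf <- paste(cluster, condition, sep = "_")
print(celltype_caf)

# Count cells per combined label; very small groups give unreliable tests
print(table(celltype_caf))
```

On the real object, the equivalent check would be a table() of the new celltype.caf metadata column.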
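Then we will use the FindMarkers function to find genes that are differentially expressed between E2 (ident.1) and control (ident.2) in cell type 6, and display up to the first 15 genes. A positive avg_log2FC means higher expression in the E2 cells. You can try this yourself with other cell types of interest.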
cluster6_ctrl_E2 <- FindMarkers(caf.integrate, ident.1 = "6_CAFE2", ident.2 = "6_CAFCTRL", verbose = FALSE)
head(cluster6_ctrl_E2, n = 15)
| | p_val | avg_log2FC | pct.1 | pct.2 | p_val_adj |
|---|---|---|---|---|---|
| | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> |
| ACTA2 | 0.01489574 | 0.3238482 | 0.994 | 1.000 | 1 |
| IGFBP3 | 0.43156318 | 0.2957208 | 0.933 | 0.992 | 1 |
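Both genes have an adjusted p-value of 1, so neither difference is significant after correction. The adjusted values in these tables are consistent with a Bonferroni correction over all 33694 features in the RNA assay. The sketch below checks that reading against p-values copied from the tables above; treat the choice of 33694 as the number of tests as an inference from the output, not a documented setting.

```r
# Raw p-values copied from the tables above
p_raw <- c(ACTA2 = 0.01489574, IGFBP3 = 0.43156318, NQO1_ctrl = 9.361404e-57)

n_features <- 33694   # number of features reported for the RNA assay

# Bonferroni: multiply by the number of tests and cap at 1
p_adj <- p.adjust(p_raw, method = "bonferroni", n = n_features)
print(p_adj)
```

With this many tests, a raw p-value has to be below roughly 0.05 / 33694 (about 1.5e-6) to stay under 0.05 after adjustment.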
sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Rocky Linux 8.10 (Green Obsidian)

Matrix products: default
BLAS/LAPACK: /apps/spack/anvil/apps/openblas/0.3.17-gcc-11.2.0-2qrsari/lib/libopenblas_zenp-r0.3.17.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
 [1] getPass_0.2-4   fansi_1.0.6     crayon_1.5.3    digest_0.6.36
 [5] utf8_1.2.4      IRdisplay_1.1   repr_1.1.6      lifecycle_1.0.3
 [9] jsonlite_1.8.8  evaluate_0.24.0 pillar_1.8.1    rlang_1.1.0
[13] cli_3.6.1       uuid_1.2-0      vctrs_0.6.1     IRkernel_1.3.2
[17] tools_4.1.0     glue_1.7.0      fastmap_1.1.1   compiler_4.1.0
[21] base64enc_0.1-3 pbdZMQ_0.3-11   htmltools_0.5.4